Abstract. Given two positive integers $n$ and $k$ and a parameter $t \in (0,1)$, we choose
at random a vector subspace $V_n \subset \mathbb{C}^k \otimes \mathbb{C}^n$ of dimension $N \sim tnk$. We show that the
set of $k$-tuples of singular values of all unit vectors in $V_n$ fills asymptotically (as $n$ tends
to infinity) a deterministic convex set $K_{k,t}$ that we describe using a new norm in $\mathbb{R}^k$.
Our proof relies on free probability, random matrix theory, complex analysis and
matrix analysis techniques. The main result comes together with a law of large
numbers for the singular value decomposition of the eigenvectors corresponding to large
eigenvalues of a random truncation of a matrix with high eigenvalue degeneracy.
1. Introduction



In [12] it was observed that if one takes at random a vector subspace $V_n$ of $\mathbb{C}^k \otimes \mathbb{C}^n$ of
relative dimension $t$ for large $n$ and fixed $k$, then, with very high probability, some
$k$-tuples of non-negative numbers never occur as singular values of elements in $V_n$ as $n$
becomes large. This result was used to provide a systematic understanding of some
non-additivity theorems for entropies in Quantum Information Theory. We refer to the
bibliography of [19] for more information on this topic.

Our aim in this paper is to provide a definitive answer to the question of which $k$-tuples
of non-negative numbers occur or not as singular values of elements in $V_n$. Our main result
can be sketched as follows; for the statement with complete definitions, we refer to Section 5.

Theorem 1.1. Let $t \in (0,1)$ be a parameter and, for any $n$, let $V_n$ be a vector subspace of
$\mathbb{C}^k \otimes \mathbb{C}^n$ of dimension $N \sim tnk$ chosen at random. Then there exists a compact set
$K_{k,t} \subset \mathbb{R}^k$ such that any $k$-tuple $\lambda$ in the interior of $K_{k,t}$ occurs with high probability
as the singular value vector of some norm one vector $x \in V_n$. Moreover, the probability that
some vector $v \notin K_{k,t}$ occurs as the singular value vector of some element $y \in V_n$ is
vanishing when $n \to \infty$.



The statement of the above theorem, as well as any other result in this paper about
singular values of vectors in a tensor product space, can be immediately translated into a
statement about singular values of matrices, simply by fixing an isomorphism
$\mathbb{C}^k \otimes \mathbb{C}^n \simeq M_{k \times n}(\mathbb{C})$; note that the euclidean norm on $\mathbb{C}^k \otimes \mathbb{C}^n$ is pushed into the
Schatten 2-norm on $M_{k \times n}(\mathbb{C})$, i.e. $\|X\| = \sqrt{\mathrm{Tr}(XX^*)}$.

Theorem 1.2. Let $t \in (0,1)$ be a parameter and, for any $n$, let $V_n$ be a vector subspace of
$M_{k \times n}(\mathbb{C})$ of dimension $N \sim tnk$ chosen at random. Then there exists a compact set
$K_{k,t} \subset \mathbb{R}^k$ such that any $k$-tuple $\lambda$ in the interior of $K_{k,t}$ occurs with high probability as
the singular value vector of a matrix $x \in V_n$ of Hilbert-Schmidt norm one. Moreover, the
probability that some vector $v \notin K_{k,t}$ occurs as the singular value vector of some
Hilbert-Schmidt norm one matrix $y \in V_n$ is vanishing when $n \to \infty$.
Even though both formulations are completely equivalent, they are of interest to different
areas of mathematics. We choose to work with singular values (or Schmidt coefficients,
as they are called in quantum information) of vectors because of the initial quantum
information theoretical motivation.

The set $K_{k,t}$ is described with the help of a new norm on $\mathbb{R}^k$ that arises from free
probability theory. Restricted to $\mathbb{R}^k_+$, it interpolates between the $\ell^1$ and the $\ell^\infty$ norm.

For the purpose of proving the above theorem, one first key technical result (Theorem
4.2) is a partial extension of a result of Haagerup and Thorbjørnsen [23] to the case of
random projections. The characterization of sequences that fail with high probability to
occur as singular values of elements in $V_n$ follows from this technical result. It uses ideas
that have been introduced in [15].

The characterization of sequences that occur with high probability as singular values 
of elements in V n is much more involved (we refer to this part of the proof of the main 
theorem as the proof of the second inclusion, whereas we refer to the previous part as the 
first inclusion). It turns out to rely not only on our first technical result, but also on a 
precise understanding of the eigenvectors of suitable random matrix models. 

In Random Matrix Theory, the asymptotic behavior of large random matrices is the 
main object of study, and the empirical distribution of the eigenvalues as a random set
is arguably the most studied kind of statistics, together with, more recently, the statistics 
of the largest eigenvalues. To our knowledge, the eigenvectors had not been recognized 
so far as variables having a structured asymptotic behavior (with a few exceptions in the 
case of spiked random matrices, see e.g. [S] and references therein), although they have 
recently been studied for various models of random matrices (see [5] for a recent work in
this direction). 

For the purposes of the proof of the second inclusion, we present in this paper a theorem 
that is of independent interest, as it shows that the eigenvectors of some random matrices 
are much more deterministic than one might expect. Our theorem can be summarized as 
follows (U(k) denotes the group of k x k unitary matrices): 

Theorem 1.3. Let $A$ be a $k \times k$ positive semidefinite matrix with simple eigenvalues. Let
$v_n$ be a sequence of numbers satisfying $v_n = o(n)$, and $N \sim tnk$ (where $t \in (0,1)$). Let
$Z_n = \Pi_n (A \otimes I_n) \Pi_n$, where $\Pi_n$ is a random projection of rank $N$. Let $y_n$ be the eigenvector
corresponding to the $v_n$-th largest eigenvalue of $Z_n$. Then, almost surely as $n \to \infty$, the
$(\mathbb{R}^k, U(k)/U(1)^k)$ part of the singular value decomposition of $y_n$ converges to a limit made
explicit in Section 5.

Finally, we study the points at the boundary of the set $K_{k,t}$ in the final section. The
boundary of the dual set is a real algebraic variety for small enough values of $t$, when
intersected with the hyperplane $\sum_i \lambda_i = 1$. In particular, we show that for some parameters
$t$ it is strictly convex, and study its faces for other values of $t$. Our techniques here rely
on free probability theory, complex and convex analysis.

The paper is organized as follows. In Section 2 we introduce our model as well as
some notation. Then, in Section 3 we introduce a new norm via an operator algebraic
construction and prove a continuity result that we use in Section 4 to prove a convergence
result for the norm of the product of random matrices. Section 5 is the main section of
our paper, where we describe the limiting shape of the collection of singular values. In
the final section we study the set $K_{k,t}$ and its dual.

2. Setup and notations 

2.1. Singular values of a vector subspace of a tensor product. The purpose of this
paragraph is to introduce a subset $K_V \subset \mathbb{R}^k$ associated to a vector subspace $V$ of a tensor
product $\mathbb{C}^k \otimes \mathbb{C}^n$. We always assume that $k$ and $n$ are integers, with $k \le n$. This set is a
'local' invariant of the inclusion $V \subset \mathbb{C}^k \otimes \mathbb{C}^n$ in the sense that it is not modified if $V$ is
modified by a unitary in $U(k) \otimes U(n)$.

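The singular values of a vector $x \in \mathbb{C}^k \otimes \mathbb{C}^n$ are non-negative numbers
$\lambda_1(x) \ge \cdots \ge \lambda_k(x) \ge 0$ such that

(1)  $x = \sum_{i=1}^{k} \sqrt{\lambda_i(x)}\, e_i(x) \otimes f_i(x),$

where $e_i(x)$ (resp. $f_i(x)$) are orthonormal vectors in $\mathbb{C}^k$ (resp. $\mathbb{C}^n$). These are the singular
values of the matrix obtained by identifying a vector $x \in \mathbb{C}^k \otimes \mathbb{C}^n$ with the $k \times n$ matrix
obtained from $x$ via the isomorphism $\mathbb{C}^k \otimes \mathbb{C}^n \simeq (\mathbb{C}^k)^* \otimes \mathbb{C}^n = M_{k \times n}(\mathbb{C})$. If $x$ is a unit
norm vector in $\mathbb{C}^{nk}$, then $\lambda(x) = (\lambda_1(x), \ldots, \lambda_k(x))$ belongs to the set

(2)  $\Delta_k^{\downarrow} = \{y \in \mathbb{R}^k_+ : y_1 \ge y_2 \ge \cdots \ge y_k \text{ and } \sum_{i=1}^{k} y_i = 1\}.$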
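As an illustration of (1) and (2), here is a minimal Python sketch for the smallest case $k = 2$. It assumes the square-root convention of equation (1), so that the numbers $\lambda_i(x)$ are the eigenvalues of $XX^*$, where $X$ is the $2 \times n$ matrix identified with $x$. The function name and the row-list input format are our own choices, not part of the paper.

```python
import math

def schmidt_coefficients_k2(rows):
    """Return (lambda_1, lambda_2), lambda_1 >= lambda_2, for x in C^2 (x) C^n.

    `rows` is a pair of equal-length lists of complex numbers: the two rows
    of the 2 x n matrix X identified with x. Following equation (1), the
    lambda_i are the eigenvalues of the 2 x 2 Hermitian matrix X X^*.
    """
    row0, row1 = rows
    a = sum(abs(z) ** 2 for z in row0)                       # (X X^*)_{11}
    d = sum(abs(z) ** 2 for z in row1)                       # (X X^*)_{22}
    b = sum(z * w.conjugate() for z, w in zip(row0, row1))   # (X X^*)_{12}
    half_trace = (a + d) / 2.0
    determinant = a * d - abs(b) ** 2
    # Eigenvalues of a 2 x 2 Hermitian matrix; clamp tiny negative rounding.
    disc = math.sqrt(max(half_trace ** 2 - determinant, 0.0))
    return (half_trace + disc, half_trace - disc)

# A product vector e_1 (x) f_1 and a maximally entangled vector, n = 2.
product = schmidt_coefficients_k2(([1.0, 0.0], [0.0, 0.0]))
r = 1.0 / math.sqrt(2.0)
entangled = schmidt_coefficients_k2(([r, 0.0], [0.0, r]))
```

For a unit vector the two returned numbers sum to one, so the output lies in the ordered simplex of (2); the product vector gives a vertex and the maximally entangled vector gives the barycenter.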

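We have $\Delta_k^{\downarrow} \subset \Delta_k$, where $\Delta_k = \{y \in \mathbb{R}^k_+ : \sum_{i=1}^{k} y_i = 1\}$ is the $(k-1)$-dimensional
probability simplex. We define the following particular vectors

(3)  $1^j 0^{k-j} = (\underbrace{1, \ldots, 1}_{j \text{ times}}, \underbrace{0, \ldots, 0}_{k-j \text{ times}}) \in \mathbb{R}^k.$

We also introduce the set $\mathbb{R}^k \setminus \mathbb{R} 1^k = \{x \in \mathbb{R}^k : \exists\, i, j \text{ with } x_i \ne x_j\}$ of vectors
with non-constant coordinates. Let $V$ be a subspace of dimension $N$ of $\mathbb{C}^k \otimes \mathbb{C}^n$, i.e. an
element of the Grassmann manifold $\mathrm{Gr}_N(\mathbb{C}^k \otimes \mathbb{C}^n)$. Let $K_V^{\downarrow}$ be the set of all singular
values of norm one vectors $x \in V$,

(4)  $K_V^{\downarrow} = \{\lambda(x) : x \in V, \|x\| = 1\} \subset \Delta_k^{\downarrow}.$

For technical reasons it will sometimes be convenient to replace it by $K_V$, which is its
symmetrized version under permuting the coordinates, $K_V$ being a subset of $\Delta_k$:

$K_V = \{(\lambda_{\sigma(1)}, \lambda_{\sigma(2)}, \ldots, \lambda_{\sigma(k)}) : \lambda \in K_V^{\downarrow}, \sigma \in S_k\}.$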

An elementary but important property of $K_V$ is that it has nice invariance properties.
The following result is an easy consequence of the singular value decomposition.

Proposition 2.1. $K_V$ is invariant under 'local' rotations, i.e. if $U_1 \in U(k)$, $U_2 \in U(n)$,
then

$K_V = K_{(U_1 \otimes U_2)V}.$

2.2. Random Subspaces. The integer $k$ and the real parameter $t \in (0,1)$ are fixed
throughout the paper. We are interested in a random sequence $(V_n)_{n \ge 1}$ of subspaces
$V_n \subset \mathbb{C}^k \otimes \mathbb{C}^n$ having the following properties:

(1) $V_n$ has dimension $N$ less than $nk$. $N$ is a function of $n$ such that $N$ and $n$ grow
to infinity according to $N \sim tnk$.

(2) The law of $V_n$ follows the only probability distribution on the Grassmann manifold
$\mathrm{Gr}_N(\mathbb{C}^k \otimes \mathbb{C}^n)$ that is invariant under the action of the unitary group $U(nk)$. We
will refer to this probability measure as the invariant measure.
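One standard way to sample from the invariant measure is to orthonormalize $N$ independent standard complex Gaussian vectors: the law of a Gaussian vector is unitarily invariant, hence so is the law of the span. The sketch below does this with modified Gram-Schmidt in plain Python. The function name, the `rng` parameter and the numerical threshold are illustrative assumptions, and the paper does not prescribe any particular sampling procedure.

```python
import math
import random

def haar_subspace_basis(dim, N, rng=None):
    """Orthonormal basis of a random N-dimensional subspace of C^dim.

    The span of N independent standard complex Gaussian vectors is
    distributed according to the unitarily invariant measure on the
    Grassmann manifold; here dim plays the role of n*k.
    """
    if not 0 < N <= dim:
        raise ValueError("need 0 < N <= dim")
    rng = rng or random.Random()
    basis = []
    while len(basis) < N:
        v = [complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(dim)]
        # Modified Gram-Schmidt: remove the components along earlier vectors.
        for b in basis:
            coeff = sum(bi.conjugate() * vi for bi, vi in zip(b, v))
            v = [vi - coeff * bi for vi, bi in zip(v, b)]
        norm = math.sqrt(sum(abs(vi) ** 2 for vi in v))
        if norm > 1e-12:  # redraw in the (probability zero) degenerate case
            basis.append([vi / norm for vi in v])
    return basis

# Example: k = 2, n = 3, so dim = 6, with N = 3 (relative dimension t = 1/2).
example_basis = haar_subspace_basis(6, 3, random.Random(0))
```

Unit vectors of the sampled subspace are the normalized linear combinations of the returned basis vectors; reshaping such a vector into a $k \times n$ array gives the matrix whose singular values are studied in the paper.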

We do not make any assumption about the correlation between the $V_n$'s for various
values of $n$. Whether they are correlated or independent does not affect our results.
In this setting, we call

$K_{n,k,t} = K_{V_n}$

and we study the sequence $K_{n,k,t}$ of symmetrical random subsets of $\Delta_k$, as $n \to \infty$. The
aim of this paper is to prove that $K_{n,k,t}$ exhibits a deterministic behavior as $n \to \infty$. In
order to describe it, we need to review a few notions of free probability theory and complex
analysis.

3. Freeness and a new family of norms on $\mathbb{R}^k$

3.1. Freeness. A *-non-commutative probability space is a unital *-algebra $\mathcal{A}$ endowed
with a tracial state $\varphi$, i.e. a linear map $\varphi : \mathcal{A} \to \mathbb{C}$ satisfying $\varphi(ab) = \varphi(ba)$,
$\varphi(aa^*) \ge 0$, $\varphi(1) = 1$. An element of $\mathcal{A}$ is called a (non-commutative) random variable. Let
$\mathcal{A}_1, \ldots, \mathcal{A}_k$ be subalgebras of $\mathcal{A}$ having the same unit as $\mathcal{A}$. They are said to be free
if for all $a_i \in \mathcal{A}_{j_i}$ ($i = 1, \ldots, k$) such that $\varphi(a_i) = 0$, one has

$\varphi(a_1 \cdots a_k) = 0$

as soon as $j_1 \ne j_2$, $j_2 \ne j_3$, \ldots, $j_{k-1} \ne j_k$. Collections $S_1, S_2, \ldots$ of random variables are
said to be free if the unital subalgebras they generate are free.

Let $(a_1, \ldots, a_k)$ be a $k$-tuple of self-adjoint random variables and let $\mathbb{C}\langle X_1, \ldots, X_k \rangle$ be
the free *-algebra of noncommutative polynomials on $\mathbb{C}$ generated by the $k$ self-adjoint
indeterminates $X_1, \ldots, X_k$. The joint distribution of the family $\{a_i\}_{i=1}^{k}$ is the linear form

$\mu_{(a_1, \ldots, a_k)} : \mathbb{C}\langle X_1, \ldots, X_k \rangle \to \mathbb{C}, \qquad P \mapsto \varphi(P(a_1, \ldots, a_k)).$

Given a $k$-tuple $(a_1, \ldots, a_k)$ of free random variables such that the distribution of $a_i$
is $\mu_{a_i}$, the joint distribution $\mu_{(a_1, \ldots, a_k)}$ is uniquely determined by the $\mu_{a_i}$'s. In particular,
$\mu_{a_1 + a_2}$ and $\mu_{a_1 a_2}$ depend only on $\mu_{a_1}$ and $\mu_{a_2}$. The notations $\mu_{a_1 + a_2} = \mu_{a_1} \boxplus \mu_{a_2}$ and
$\mu_{a_1 a_2} = \mu_{a_1} \boxtimes \mu_{a_2}$ were introduced in Voiculescu's works [32][33]; the operations $\boxplus$ and $\boxtimes$ are
called the free additive, respectively free multiplicative, convolution. A family $(a_1^n, \ldots, a_k^n)_n$
of $k$-tuples of random variables is said to converge in distribution towards $(a_1, \ldots, a_k)$
iff for all $P \in \mathbb{C}\langle X_1, \ldots, X_k \rangle$, $\mu_{(a_1^n, \ldots, a_k^n)}(P)$ converges towards $\mu_{(a_1, \ldots, a_k)}(P)$ as $n \to \infty$.
Sequences of random variables $(a_1^n)_n, \ldots, (a_k^n)_n$ are called asymptotically free as $n \to \infty$
iff the $k$-tuple $(a_1^n, \ldots, a_k^n)_n$ converges in distribution towards a family of free random
variables.

The following result was contained in [34] (see also [20]).

Theorem 3.1. Let $\{U_k^{(n)}\}_{k \in \mathbb{N}}$ be a collection of independent Haar distributed random
unitary matrices of $M_n(\mathbb{C})$ and $\{W_k^{(n)}\}_{k \in \mathbb{N}}$ be a set of constant matrices of $M_n(\mathbb{C})$ admitting a
joint limit distribution as $n \to \infty$ with respect to the state $\varphi_n = n^{-1}\mathrm{Tr}$. Then, almost
surely, the family $\{U_k^{(n)}, W_k^{(n)}\}_{k \in \mathbb{N}}$ admits a limit *-distribution $\{u_k, w_k\}_{k \in \mathbb{N}}$ with respect
to $\varphi_n$, such that $u_1, u_2, \ldots, \{w_1, w_2, \ldots\}$ are free.

3.2. Analytic transforms associated to free convolutions: definitions and reminders
of classical results in complex analysis. We start with the following classical
definitions:

I) The Cauchy-Stieltjes transform (or Cauchy transform) of a finite measure $\mu$ on the
real line:

$G_\mu(z) = \int_{\mathbb{R}} \frac{1}{z - t}\, d\mu(t), \qquad z \in \mathbb{C} \setminus \mathrm{supp}(\mu),$

where $\mathrm{supp}(\mu)$ denotes the topological support of $\mu$. If $\mu$ is a positive measure, then
$G_\mu$ maps the upper half into the lower half of the complex plane, and $G_\mu(\bar{z}) = \overline{G_\mu(z)}$.
Moreover, $\mu(\mathbb{R}) = \lim_{y \to +\infty} iy\, G_\mu(iy)$.
II) $F_\mu(z) = 1/G_\mu(z)$, $z \in \mathbb{C} \setminus \mathrm{supp}(\mu)$. If the positive measure $\mu$ has compact support,
then there exists a unique positive measure $\rho$ on the real line, whose support is
included in the convex hull of $\mathrm{supp}(\mu)$, so that

$F_\mu(z) = \frac{z}{\mu(\mathbb{R})} - \frac{1}{(\mu(\mathbb{R}))^2} \int_{\mathbb{R}} t\, d\mu(t) + \int_{\mathbb{R}} \frac{1}{t - z}\, d\rho(t).$

This is a particular case of the so-called Nevanlinna representation of $F_\mu$ [1, Equation
3.3]. We shall almost exclusively be concerned with the case when $\mu(\mathbb{R}) = 1$ and
$\mathrm{supp}(\mu)$ is a compact subset of $[0, +\infty)$. In that case, the total mass of $\rho$ equals the
variance $\mathrm{VAR}(\mu)$ of $\mu$: $\rho(\mathbb{R}) = \int s^2\, d\mu(s) - \left(\int s\, d\mu(s)\right)^2$.

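III) The moment generating function of a probability $\mu$ supported in $[0, +\infty)$ is

$\psi_\mu(z) = \int_{[0,+\infty)} \frac{zt}{1 - zt}\, d\mu(t), \qquad z \in \mathbb{C} \setminus (1/\mathrm{supp}(\mu)).$

It maps upper and lower half-planes into themselves. It will be useful to note

(5)  $\psi_\mu(z) = \frac{1}{z}\, G_\mu\!\left(\frac{1}{z}\right) - 1, \qquad \psi_\mu(0) = 0.$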

IV) To compute free multiplicative convolutions of probability distributions on $[0, +\infty)$
Voiculescu introduced the $S$-transform. It is defined on a small enough neighborhood
of zero as

$S_\mu(z) = \frac{1+z}{z}\, \psi_\mu^{-1}(z),$

whenever $\mu \ne \delta_0$ is a compactly supported probability measure on $[0, +\infty)$. It
satisfies the equation

(6)  $S_{\mu \boxtimes \nu}(z) = S_\mu(z) S_\nu(z)$ for $|z|$ small.

From now on, unless otherwise specified, whenever we refer to $\psi_\mu^{-1}$, we refer to the
inverse of $\psi_\mu$ around zero and to its analytic continuation along the real line. It is of
interest to us to give a better description of the domain of injectivity of $\psi_\mu$ and the
image of this domain. A direct computation (see also [10]) shows that $\Im \psi_\mu'(z) > 0$
for any $z$ in the upper half-plane for which $\Re z < 1/[\mu]$, where the notation $[\mu]$ is
introduced in (7). Since $\psi_\mu(\bar{z}) = \overline{\psi_\mu(z)}$ and $\psi_\mu$ preserves upper and lower
half-planes, we conclude that $\psi_\mu$ is injective on $\{z \in \mathbb{C} : \Re z < 1/[\mu]\}$. On the other
hand,

$\psi_\mu(x + iy) = \int \frac{t(x + iy)}{1 - t(x + iy)}\, d\mu(t) = \int \frac{tx - t^2(x^2 + y^2)}{(1 - tx)^2 + t^2 y^2}\, d\mu(t) + iy \int \frac{t}{t^2 y^2 + (1 - tx)^2}\, d\mu(t).$

We easily observe that $\Im \psi_\mu(x + iy) > \frac{y}{2(x^2 + y^2 + 1)} \int \frac{t}{t^2 + 1}\, d\mu(t)$ for $y > 0$, and, in particular,
$\Im \psi_\mu(x + i) > \frac{1}{2(x^2 + 2)} \int \frac{t}{t^2 + 1}\, d\mu(t)$ for all $x \in \mathbb{R}$. This gives us a bound on the
"thinness" of the domain of $\psi_\mu^{-1}$ in terms of the integral $\int \frac{t}{t^2 + 1}\, d\mu(t)$.

These transforms have properties that make them important in the study of free
convolutions.
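For a purely atomic measure with equal weights (the case of $\mu_x$ used later in the paper), these transforms are finite sums and can be evaluated directly. The Python sketch below is only an illustration, with hypothetical function names of our own: it evaluates $G_\mu$, $F_\mu$ and $\psi_\mu$, and estimates the total mass of the Nevanlinna measure $\rho$ from the behaviour of $F_\mu$ at a large real point, using $F_\mu(z) = z - \int t\, d\mu(t) + \int (t - z)^{-1} d\rho(t)$.

```python
def cauchy_transform(atoms, z):
    """G_mu(z) for mu = (1/k) * sum of Dirac masses at the points of `atoms`."""
    return sum(1.0 / (z - a) for a in atoms) / len(atoms)

def reciprocal_cauchy(atoms, z):
    """F_mu(z) = 1 / G_mu(z)."""
    return 1.0 / cauchy_transform(atoms, z)

def moment_generating(atoms, z):
    """psi_mu(z) = integral of z t / (1 - z t) with respect to mu."""
    return sum(z * a / (1.0 - z * a) for a in atoms) / len(atoms)

def nevanlinna_mass_estimate(atoms, z=1.0e4):
    """Approximate rho(R) by z * (z - mean - F_mu(z)) for a large real z.

    The quantity equals z * integral of d rho(t) / (z - t), which tends
    to rho(R) as z grows; the error is of order 1/z.
    """
    mean = sum(atoms) / len(atoms)
    return z * (z - mean - reciprocal_cauchy(atoms, z))

atoms = [0.0, 1.0, 3.0]
z = 0.1 + 0.2j
lhs = moment_generating(atoms, z)
rhs = cauchy_transform(atoms, 1.0 / z) / z - 1.0    # right-hand side of (5)
variance = sum(a * a for a in atoms) / 3.0 - (sum(atoms) / 3.0) ** 2
mass = nevanlinna_mass_estimate(atoms)
```

On this example `lhs` and `rhs` agree up to rounding, in line with (5), and `mass` is close to the variance $14/9$ of the measure, in line with item II).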

Finally, we recall for the convenience of the reader a few classical results of complex 
analysis that we will need in the forthcoming proofs. 

I) The unit disc in the complex plane (and any conformally equivalent domain) can
be made into a metric space with a natural metric (the so-called pseudohyperbolic
metric) with respect to which any analytic self-map of the unit disc becomes a
contraction. This is essentially the Schwarz-Pick Lemma, which we formulate here for
the upper half-plane: if $f$ is an analytic self-map of the upper half-plane, then

$\left| \frac{f(z) - f(w)}{f(z) - \overline{f(w)}} \right| \le \left| \frac{z - w}{z - \overline{w}} \right|, \qquad \Im z, \Im w > 0.$

Equality holds at a given pair of points if and only if $f$ is a Möbius map.

In addition, if $z_0$ is a fixed point of $f$, and $f$ is not the identity mapping or a
rotation, then $z_0$ is the unique fixed point of $f$ and $|f'(z_0)| < 1$. The reader can find
a wonderful presentation of this subject (and much more) in the first chapter of [22].
II) There are self-maps of the upper half-plane that have no fixed points in their domains.
However, one can generalize this notion so that all such maps have a fixed point. In
order to do this, we should define the notion of non-tangential limit. The function $f$
defined on the upper half-plane has a non-tangential limit $d$ at the point $x \in \mathbb{R} \cup \{\infty\}$
(and we shall write that as $\sphericalangle\lim_{z \to x} f(z) = d$) if the limit of $f(z)$ exists and equals
$d$ whenever $z$ approaches $x$ inside any closed cone $\Gamma$ included in $\{x\} \cup \{z \in \mathbb{C} : \Im z > 0\}$.
This way one can also extend the notion of derivative: the Julia-Carathéodory
derivative of $f$ at a point $x \in \mathbb{R}$ where $\sphericalangle\lim_{z \to x} f(z) = d \in \mathbb{R}$ is defined as

$f'(x) = \sphericalangle\lim_{z \to x} \frac{f(z) - d}{z - x}.$

Remarkably, when the Julia-Carathéodory derivative of the function $f$ is finite, then
$f'(x) = \sphericalangle\lim_{z \to x} f'(z)$. If $x = d = \infty$, then the correct definition of the
Julia-Carathéodory derivative is $\sphericalangle\lim_{z \to \infty} z/f(z)$. It is known that $f'(x) \in (0, +\infty]$. It turns
out that there can be infinitely many points $d \in \mathbb{R} \cup \{\infty\}$ so that $\sphericalangle\lim_{z \to d} f(z) = d$.
But if $f$ has no fixed point in the upper half-plane and is not a Möbius map, then
there exists exactly one point $d \in \mathbb{R} \cup \{\infty\}$ so that

$\sphericalangle\lim_{z \to d} f(z) = d \quad \text{and} \quad f'(d) \in (0, 1].$

A complete and very accessible reference for these results is [31].

III) Non-tangential limits of an analytic map $f$ on the upper half-plane can be said to
uniquely determine $f$. Indeed, according to a theorem due to Privalov, if there exists
a set $E \subset \mathbb{R}$ of non-zero Lebesgue measure so that $\sphericalangle\lim_{z \to x} f(z) = 0$ for all $x \in E$,
then $f$ is identically equal to zero [15, Theorem 8.1].

IV) Conveniently, atoms of a probability measure $\mu$ can be easily expressed in terms of
the Julia-Carathéodory derivatives of $F_\mu$ and $\psi_\mu$ as

$\sphericalangle\lim_{z \to d} F_\mu(z) = 0, \qquad F_\mu'(d) = a,$

$\sphericalangle\lim_{z \to 1/d} \frac{\psi_\mu(z)}{1 + \psi_\mu(z)} = 1, \qquad \left( \frac{\psi_\mu}{1 + \psi_\mu} \right)'(1/d) = da,$

if and only if $\mu(\{d\}) = 1/a$. In particular, if $d$ is an isolated atom of $\mu$, then both
$F_\mu$ and $\psi_\mu/(1 + \psi_\mu)$ extend analytically around $d$.
To conclude, let us note that if $\mu$ is the distribution of the self-adjoint random variable
$y \in \mathcal{A}$ with respect to $\varphi$, then

$G_\mu(z) = \varphi((z - y)^{-1}), \qquad z \notin \sigma(y).$

This will be important in our study of norms of operators via transforms. It follows
from the above equality that $\|y\| = \max\{\sup \mathrm{supp}(\mu), -\inf \mathrm{supp}(\mu)\}$, so that $\|y\|$ can be
described also as the maximum between the largest $x \in \mathbb{R}$ in which $G_\mu$ is not analytic and
minus the smallest $x \in \mathbb{R}$ in which $G_\mu$ is not analytic. If $y$ is a positive operator, then
$\|y\| = \sup \mathrm{supp}(\mu)$ and this number coincides with the largest $x \in \mathbb{R}$ in which $G_\mu$ is not
analytic.

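In terms of the transforms $F$ and $\psi/(1 + \psi)$, we have the following characterizations of
$\|y\|$:

$\|y\| = \max(\{x \in \mathbb{R} : F_\mu(x) = 0\} \cup \{x \in \mathbb{R} : F_\mu \text{ not analytic in } x\}),$

and

$\frac{1}{\|y\|} = \min(\{x \in \mathbb{R} : \psi_\mu(x)(1 + \psi_\mu(x))^{-1} = 1\} \cup \{x \in \mathbb{R} : \psi_\mu(\cdot)(1 + \psi_\mu(\cdot))^{-1} \text{ not analytic in } x\}).$

We shall denote

(7)  $[\mu] = \max\{|v| : v \in \mathrm{supp}(\mu)\}.$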
3.3. The (t)-norm: definition. We introduce now a norm on $\mathbb{R}^k$ which will have a very
important role to play in the description of the set $K_{n,k,t}$ in the asymptotic limit $n \to \infty$.

Definition 3.2. For a positive integer $k$, embed $\mathbb{R}^k$ as a self-adjoint real subalgebra $\mathcal{R}$ of
a $\mathrm{II}_1$ factor $\mathcal{A}$ endowed with trace $\varphi$, so that $\varphi((x_1, \ldots, x_k)) = (x_1 + \cdots + x_k)/k$. Let $p_t$ be
a projection of rank $t \in (0, 1]$ in $\mathcal{A}$, free from $\mathcal{R}$. On the real vector space $\mathbb{R}^k$, we introduce
the following norm, called the (t)-norm:

(8)  $\|x\|_{(t)} := \|p_t x p_t\|_\infty,$

where the vector $x \in \mathbb{R}^k$ is identified with its image in $\mathcal{R}$.



The fact that $\|\cdot\|_{(t)}$ is indeed a norm deserves a proof, that we postpone to Lemma 3.5
in the next subsection. Before that, we show that complex analysis stands as a powerful
tool to study the distribution of $p_t x p_t$, and therefore of the (t)-norm.

Note that the distributions of the random variables $x$ and $p_t$ are, respectively,
$\mu_x = k^{-1} \sum_{i=1}^{k} \delta_{x_i}$ and $\mu_{p_t} = (1 - t)\delta_0 + t\delta_1$. Therefore, in the framework of free probability and
following the notation of Equation (7), $\|x\|_{(t)} = [\mu_x \boxtimes \mu_{p_t}]$ (recall the definitions of the operations
$\boxplus$ and $\boxtimes$ from Section 3.1).

In the next proposition, we provide a free probabilistic description of the (t)-norm,
which will turn out to be very useful. This result, first proved in [28], is contained in [29],
Lecture 14.

Proposition 3.3. The distribution $\mu_{t^{-1} p_t x p_t}$ of the (non-commutative) random variable
$t^{-1} p_t x p_t$ in the $\mathrm{II}_1$ factor reduced by the projection $p_t$ is related to the distribution $\mu_x$ of
$x$ in the non-reduced factor by the equation

(9)  $\mu_{t^{-1} p_t x p_t} = \mu_x^{\boxplus 1/t}, \qquad t \in (0, 1],$

where $\boxplus$ denotes the free additive convolution of Voiculescu. Hence, $\|x\|_{(t)}$ is $t$ times
the maximum between the upper bound and minus the lower bound of the support of the
probability measure $\mu_x^{\boxplus 1/t}$.

It is possible to express the distribution of $p_t x p_t$ in terms of the distribution of $x$, after
the method described in [5][6]:

Proposition 3.4. Denoting $G_\mu(z) = \int_{\mathbb{R}} (z - t)^{-1}\, d\mu(t)$ the Cauchy-Stieltjes transform of
a measure $\mu$ and $F_\mu(z) = 1/G_\mu(z)$, the following relations hold

(10)  $F_{\mu_x^{\boxplus 1/t}}(z) = F_{\mu_x}(\omega_{1/t}(z)), \qquad \omega_{1/t}(z) = tz + (1 - t) F_{\mu_x^{\boxplus 1/t}}(z),$

so that the function $\omega_{1/t}$ is the right inverse of the function $H_{1/t}(w) = \frac{1}{t} w + \left(1 - \frac{1}{t}\right) F_{\mu_x}(w)$,
for $\Im z > 0$. Moreover, $\omega_{1/t}$ extends continuously to the closure of the upper half-plane.

3.4. The (t)-norm: first properties. We first prove properties about the (t)-norm that
do not rely on complex analytic tools.

Lemma 3.5. The map $x \mapsto \|x\|_{(t)}$ defines indeed a norm. The (t)-norm has the
following properties:

(1) It is invariant under permutation of coordinates:

$\|(x_1, x_2, \ldots, x_k)\|_{(t)} = \|(x_{\sigma(1)}, x_{\sigma(2)}, \ldots, x_{\sigma(k)})\|_{(t)} \qquad \forall \sigma \in S_k.$

(2) For all $s \ge 0$ and for all vectors $x$ for which $\|x\|_{(t)}$ is achieved at the
upper (resp. lower) bound of the support of $\mu_x^{\boxplus 1/t}$,

$\|x + s 1^k\|_{(t)} = \|x\|_{(t)} + s \qquad (\text{resp. } \|x - s 1^k\|_{(t)} = \|x\|_{(t)} + s).$

(3) The (t)-norm is determined by its restriction to the ordered probability simplex $\Delta_k^{\downarrow}$.

(4) Whenever $t > 1 - \frac{1}{k}$ we have $\|x\|_{(t)} = \|x\|_\infty$.

Proof. The fact that $\|\lambda x\|_{(t)} = |\lambda| \|x\|_{(t)}$ follows from the definition. The triangle
inequality follows from

$\|p_t (x + y) p_t\|_\infty = \|p_t x p_t + p_t y p_t\|_\infty \le \|p_t x p_t\|_\infty + \|p_t y p_t\|_\infty = \|x\|_{(t)} + \|y\|_{(t)}.$

Now, assume that $\|x\|_{(t)} = 0$. This is equivalent to $p_t x p_t = 0$. In turn, this is equivalent
to $p_t x p_t x p_t = 0$, because $x = x^*$. This is equivalent to $\varphi(p_t x p_t x p_t) = 0$ because $\varphi$ is
faithful and $p_t x p_t x p_t$ is positive. But a direct computation shows that $\varphi(p_t x p_t x p_t) =
t(t\varphi(x^2) + (1 - t)\varphi(x)^2)$. Since $t \in [0, 1]$, this can be zero iff $t = 0$ or $x = 0$.

The invariance under permutation follows from the fact that the moments of $p_t x p_t$ are
symmetric functions in the $x_i$, so this proves point (1).

Point (2) follows from the fact that $p_t (x + s1) p_t = p_t x p_t + s p_t$ and from functional
calculus.

The third point is a direct consequence of the second one.

Writing $x = a_1 q_1 + \cdots + a_k q_k$, $x$ reaches its norm $a_k$ on a projection $q_k$ of trace at
least $1/k$, so $\varphi(\inf\{p_t, q_k\}) \ge \varphi(p_t) + \varphi(q_k) - 1 > 0$, and hence $p_t x p_t \ge a_k \inf\{p_t, q_k\}$, so
$\|p_t x p_t\| = \|x\|_\infty = a_k$. □

It might be worth noting that with a little extra effort one can show that the equality
$\|x\|_{(t)} = \|x\|_\infty$ from the above lemma holds also when $t = 1 - \frac{1}{k}$.

In general it is difficult to explicitly compute the (t)-norm. We gather in the next 
proposition some important properties that can be obtained with methods of complex 
analysis. 

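Proposition 3.6. The (t)-norm has the following properties:

1. For any $x \in \mathbb{R}^k$,

(11)  $\frac{1}{t} \|x\|_{(t)} = \frac{1}{t} w_x + \left(1 - \frac{1}{t}\right) F_{\mu_x}(w_x),$

where $w_x$ is the largest in absolute value solution to the equation

(12)  $F_{\mu_x}(w) \left( F_{\mu_x}'(w) - \frac{1}{1 - t} \right) = 0.$

Moreover, the map $t \mapsto \|x\|_{(t)}$ is non-decreasing on $(0, 1]$.

2. For all $j = 1, 2, \ldots, k$, one has

(13)  $\|1^j 0^{k-j}\|_{(t)} = \begin{cases} t + u - 2tu + 2\sqrt{tu(1 - t)(1 - u)} & \text{if } t + u < 1, \\ 1 & \text{if } t + u \ge 1, \end{cases}$

where $u = j/k$.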
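The two items can be cross-checked numerically. The Python sketch below evaluates the closed form (13) and, independently, computes the (t)-norm of a vector by solving $F_{\mu_x}'(w) = 1/(1-t)$ to the right of $\max_i x_i$ by bisection and inserting the root into (11). It is a sketch under stated assumptions rather than the authors' procedure: we assume that $x$ has non-negative entries (so that the norm is attained at the upper edge of the spectrum of $p_t x p_t$), and we use the fact that the norm equals $\max_i x_i$ as soon as the top atom of $\mu_x$ has mass at least $1 - t$. The function names are our own.

```python
import math

def indicator_t_norm(j, k, t):
    """Closed form (13) for the (t)-norm of (1,...,1,0,...,0), j ones out of k."""
    u = j / k
    if t + u >= 1.0:
        return 1.0
    return t + u - 2.0 * t * u + 2.0 * math.sqrt(t * u * (1.0 - t) * (1.0 - u))

def t_norm_nonneg(x, t):
    """(t)-norm of a vector x with non-negative entries, for 0 < t <= 1."""
    k = len(x)
    x_max = max(x)
    top_mass = sum(1 for xi in x if xi == x_max) / k
    if top_mass >= 1.0 - t:
        return x_max                      # the top atom survives compression
    target = 1.0 / (1.0 - t)

    def F_and_dF(w):
        G = sum(1.0 / (w - xi) for xi in x) / k
        dG = -sum(1.0 / (w - xi) ** 2 for xi in x) / k
        return 1.0 / G, -dG / (G * G)     # F = 1/G and F' = -G'/G^2

    # On (x_max, +infinity), F' decreases from 1/top_mass > target towards 1.
    lo, hi = x_max, x_max + 1.0
    while F_and_dF(hi)[1] > target:
        hi = x_max + 2.0 * (hi - x_max)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid == lo or mid == hi:        # interval exhausted in floating point
            break
        if F_and_dF(mid)[1] > target:
            lo = mid
        else:
            hi = mid
    w = 0.5 * (lo + hi)
    F, _ = F_and_dF(w)
    return w + (t - 1.0) * F              # equation (11) multiplied by t

closed = indicator_t_norm(1, 4, 0.25)
numeric = t_norm_nonneg([1.0, 0.0, 0.0, 0.0], 0.25)
```

For $j = 1$, $k = 4$, $t = 1/4$ both computations give $3/4$ up to rounding: the root of the derivative equation is $w = 3/2$, where $F_{\mu_x}(w) = 1$.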

Proof. As it is more natural in probabilistic terms to do it, we shall make the change
of parameter $s = 1/t$. Note that in terms of probability measures, $\mu_x$ is purely atomic
and compactly supported, hence $G_{\mu_x}$ is a rational function analytic on a neighbourhood
of infinity which maps $\mathbb{R} \cup \{\infty\}$ into itself. Moreover, the radius of convergence around
infinity for $G_{\mu_x}$ equals $\|x\|_\infty$ (in the sense that $G_{\mu_x}(z) = \sum_{n=0}^{\infty} \left( \int t^n\, d\mu_x(t) \right) z^{-n-1}$ for
$|z| > \|x\|_\infty$). It follows that $F_{\mu_x}$ is also a rational function which maps $\mathbb{R} \cup \{\infty\}$ into itself.
Moreover the Nevanlinna representation [1, Equation 3.3] of $F_{\mu_x}$ reads

$F_{\mu_x}(z) = a + z + \int \frac{1}{t - z}\, d\rho(t),$

where $a = -\int t\, d\mu_x(t)$ and $\rho$ is a compactly supported purely atomic positive measure
on the real line with total mass $\rho(\mathbb{R}) = \mathrm{VAR}(\mu_x)$. A direct computation shows that
$\|x\|_\infty$ is the largest, in absolute value, solution of the equation $F_{\mu_x}(v) = 0$, and, moreover,
$F_{\mu_x}(z) - z$ is analytic around infinity, with a radius of convergence strictly greater than the
radius of convergence of $G_{\mu_x}$ (in the sense that $F_{\mu_x}(z) - z$ is analytic on the complement of
a disc of radius strictly smaller than the one corresponding to $G_{\mu_x}$). This last statement
is clearly true for any probability $\mu$ for which $[\mu] = \max\{|v| : v \in \mathrm{supp}(\mu)\}$ is reached at
an isolated atom of $\mu$.

Thus, as $\|p_t x p_t\| = t \max\{|a| : a \in \mathrm{supp}(\mu_x^{\boxplus s})\}$, it follows that $\|x\|_{(t)}/t$ coincides with
the largest in absolute value real number $v$ so that either $F_{\mu_x^{\boxplus s}}(v) = 0$ or $F_{\mu_x^{\boxplus s}}$ is not
analytic in $v$, with the first case corresponding to the situation in which the maximum is
reached at an isolated atom of $\mu_x^{\boxplus s}$. The first statement follows from the above observation
and from Definition 3.2 and Proposition 3.3.

Denote $J$ the interval in $\mathbb{R}$ containing arbitrarily large positive numbers on which $F_{\mu_x}$
is analytic; clearly, $J \supseteq (\|x\|_\infty, +\infty)$. Also, denote $J_s$ the similar interval corresponding
to $F_{\mu_x^{\boxplus s}}$. From the Nevanlinna representation, we gather the following:

• For any $s \ge 1$, $z \in J_s$,

$F_{\mu_x^{\boxplus s}}(z) < z - \int t\, d\mu_x^{\boxplus s}(t), \qquad F_{\mu_x^{\boxplus s}}'(z) > 1, \qquad F_{\mu_x^{\boxplus s}}''(z) < 0.$

• If $(\mu_x^{\boxplus s})^{ac}$ denotes the (necessarily non-zero whenever $s > 1$) absolutely continuous
part of $\mu_x^{\boxplus s}$, then

$\inf J_s = \max\{v : v \in \mathrm{supp}(\mu_x^{\boxplus s})^{ac}\}
= \max\{v : F_{\mu_x^{\boxplus s}} \text{ not analytic in } v\}
= \max\{v : \omega_s \text{ not analytic in } v\};$

• Let us denote $x(s) = \inf J_s$. Then

$x(s) = s v(s) + (1 - s) F_{\mu_x}(v(s)), \qquad s > 1,$

where $v(s)$ is the largest solution of the equation $F_{\mu_x}'(v) = \frac{s}{s - 1}$.

Only the last item needs some justification: it follows from equation (10) that the domains
of analyticity of $\omega_s$ and $F_{\mu_x^{\boxplus s}}$ coincide. Moreover, $\omega_s$ being the right inverse of $H_s$, it follows
that $H_s(\omega_s(z)) = z$ for all $z \in J_s$ and $\omega_s(H_s(z)) = z$ for all $z \in H_s(J_s)$. Computing the
derivative $H_s'(z) = s + (1 - s) F_{\mu_x}'(z)$ and using the first item above, it follows that the
first obstacle for the analytic extension of $\omega_s$ along $\mathbb{R}$ coming from $+\infty$ is the point
$H_s(v(s))$ with $v(s)$ described in the last item above. Then, $x(s) = H_s(v(s)) = s v(s) + (1 - s) F_{\mu_x}(v(s))$.

Elementary implicit differentiation gives

$x'(s) = v(s) + s v'(s) - F_{\mu_x}(v(s)) + (1 - s) F_{\mu_x}'(v(s)) v'(s) = v(s) - F_{\mu_x}(v(s)).$

We have used above the fact that $F_{\mu_x}'(v(s)) = \frac{s}{s - 1}$. Then

$\left( \frac{x(s)}{s} \right)' = \frac{s x'(s) - x(s)}{s^2} = \frac{s v(s) - s F_{\mu_x}(v(s)) - s v(s) - (1 - s) F_{\mu_x}(v(s))}{s^2} = -\frac{F_{\mu_x}(v(s))}{s^2}.$

As noted, if $\|x\|_{(t)}$ is achieved at the upper bound of the support of the distribution of
$p_t x p_t$, then $\|x\|_{(t)} = x(s)/s$ whenever $\|x\|_{(t)}$ is not achieved at an atom of $\mu_x^{\boxplus s}$.

To complete the proof, we observe that without loss of generality, we may assume that
$\|x\|_{(t)}$ is achieved at the upper bound of the support of our measure. If this upper bound
coincides with an atom of the measure, then we have already seen in Lemma 3.5 that
$\|x\|_{(t)} = \|x\|_\infty$. If that is not the case, then $\|x\|_{(t)} = x(s)/s$. We claim that $F_{\mu_x}(v(s)) > 0$.
Indeed, if $F_{\mu_x}(v(s)) < 0$, then there must be some point $z_0 > v(s)$ so that $F_{\mu_x}(z_0) = 0$ and
hence $H_s(z_0) = s z_0$. But $v(s) = \omega_s(x(s))$, $\omega_s$ is defined right of $x(s)$ and $s z_0 > x(s)$, hence
$z_0 = \omega_s(s z_0) = \frac{1}{s}(s z_0) + \left(1 - \frac{1}{s}\right) F_{\mu_x^{\boxplus s}}(s z_0)$ implies $F_{\mu_x^{\boxplus s}}(s z_0) = 0$, so $s z_0$ is an atom for
$\mu_x^{\boxplus s}$, a contradiction. Thus, the function $s \mapsto \|x\|_{(1/s)}$ is non-increasing, strictly decreasing
when $\|x\|_{(t)}$ is reached at the boundary of the support of the absolutely continuous part
of $\mu_x^{\boxplus s}$.

Note that our proof does not exclude the possibility that, as $t$ decreases, $\|x\|_{(t)}$ could
switch from being achieved at the upper bound of the support of $\mu_x^{\boxplus s}$ to being achieved at
its lower bound. However, the argument above still holds even if such a switch happens.

For the last item, see [33], example 3.6.7. This is one of the few cases when an exact
expression for the (t)-norm is known and it has been heavily used in [19]. □

In Figure 1 the ball for the (t)-norm is plotted for $k = 2$. Note that the shape of the
ball depends only on a parameter $x_t$ (defined piecewise, with a separate expression for
$t < 1/2$), whose dependence in $t$ is also plotted in the right-hand side subfigure.

Figure 1. The unit ball for the (t)-norm in $\mathbb{R}^2$.



Let us mention that the solution to the equation $F_{\mu_x}(w) = 0$ corresponds to an atom,
that is, if the solution $w_x$ is of $F_{\mu_x}(w) = 0$, the norm is achieved either at an atom of
$\mu_x^{\boxplus 1/t}$ or at a point where the density of this measure is infinite. Atoms of the probability
measure $\mu_x^{\boxplus 1/t}$ have been fully described in [5] by the formula

(14)  $\mu^{\boxplus 1/t}(\{a\}) = \max\left\{0, \frac{1}{t}\mu(\{ta\}) - \frac{1}{t} + 1\right\}.$

Let us record for further use that the above implies that when $t < \frac{1}{k}$ the measure $\mu_x^{\boxplus 1/t}$
is necessarily absolutely continuous with respect to the Lebesgue measure on $\mathbb{R}$.
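Formula (14) is easy to evaluate for the empirical measure $\mu_x$. The following Python sketch lists the atoms of $\mu_x^{\boxplus 1/t}$ using exact rational arithmetic; the function name is hypothetical, and the entries of `x` are assumed to be integers or `Fraction` objects so that equal coordinates are detected exactly.

```python
from collections import Counter
from fractions import Fraction

def atoms_of_free_power(x, t):
    """Atoms of mu_x^{boxplus 1/t} predicted by formula (14).

    Returns a dict {location: mass}. An atom of mu_x at the value v with
    mass m produces an atom at v / t with mass m / t - 1 / t + 1 whenever
    this quantity is positive, i.e. whenever m > 1 - t.
    """
    t = Fraction(t)
    k = len(x)
    atoms = {}
    for value, count in Counter(x).items():
        mass = Fraction(count, k) / t - 1 / t + 1
        if mass > 0:
            atoms[Fraction(value) / t] = mass
    return atoms

surviving = atoms_of_free_power([1, 1, 0], Fraction(1, 2))
none_left = atoms_of_free_power([1, 1, 0], Fraction(1, 4))
```

With $x = (1, 1, 0)$ and $t = 1/2$, only the atom of mass $2/3$ survives and it becomes an atom of mass $1/3$ at the point $2$. With $t = 1/4 < 1/k$ no atom survives, in agreement with the remark above.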
3.5. The (t)-norm: continuity. This section contains a technical result for the
continuity of the (t)-norm in $t$ and of the $\boxtimes$ operation. Proposition 3.8 is the main result here
and it has independent interest in free probability. In the rest of this paper, we shall use
a simpler incarnation of this result in the form of Corollary 3.9.

Proposition 3.7. Assume that $\mu$ is a compactly supported probability measure on $[0, +\infty)$.
Then the map $[1, +\infty) \ni t \mapsto [\mu^{\boxplus t}] \in (0, +\infty)$ is continuous, algebraic outside a bounded
discrete subset of $(1, +\infty)$. Moreover,

(15) < [n m+£ ] - [p m ] < t (yFVAR(jJj + e VAR(p)^j , t > 1, e > 0. 

Proof. As noted before, $[\mu^{\boxplus t}]$ is the largest positive number where either $F_{\mu^{\boxplus t}}$ is not
analytic, or $F_{\mu^{\boxplus t}}$ takes the value zero. We shall use equations (10) in order to analyze this
number. It follows easily that $F_{\mu^{\boxplus t}}$ is not analytic in $x_0$ if and only if $\omega_t$ is not analytic in
$x_0$. This latter function is the right inverse of

$H_t(w) = tw + (1 - t) F_\mu(w) = w + (t - 1) \int_{[0,+\infty)} s\, d\mu(s) + (1 - t) \int_{[0,+\infty)} \frac{1}{s - w}\, d\rho(s),$

according to the Nevanlinna representation of $F_\mu$.

One can see directly that for any $w$ in the interval of analyticity of $H_t$ included in
$([\rho], +\infty)$, we have $H_t(w) > w$, $H_t'(w) = 1 + (1 - t) \int \frac{1}{(s - w)^2}\, d\rho(s)$. For simplicity, we
shall denote $x_t$ the largest point in the real line in which $\omega_t$ is not analytic. Thus, $H_t$
maps the interval $[\max\{(H_t')^{-1}(\{0\}), [\rho]\}, +\infty)$ bijectively onto $[x_t, +\infty)$. For $t > 1$ large
enough, it is clear that $[\max\{(H_t')^{-1}(\{0\}), [\rho]\}, +\infty) = [\max (H_t')^{-1}(\{0\}), +\infty)$, and so the
correspondence $t \mapsto \max (H_t')^{-1}(\{0\})$ is clearly algebraic (in fact analytic). The relation
$H_t(\omega_t(x)) = x$ implies that $x_t = H_t(\max (H_t')^{-1}(\{0\}))$, which is an analytic function.
As $t$ decreases towards 1, it may happen (whenever $\lim_{w \downarrow [\rho]} \int \frac{1}{(s - w)^2}\, d\rho(s) < +\infty$) that
$\max (H_t')^{-1}(\{0\})$ either does not exist, or is no greater than $[\rho]$. We shall note that in this
case there is a $t_0 = 1 + \left( \lim_{w \downarrow [\rho]} \int \frac{1}{(s - w)^2}\, d\rho(s) \right)^{-1}$ so that the function $t \mapsto x_t$ is analytic
on $(t_0, +\infty)$ and extends continuously to $t_0$. On the interval $[1, t_0]$ we have, by the same
relation $H_t(\omega_t(x)) = x$,

$x_t = H_t([\rho]) = t[\rho] + (1 - t) \lim_{w \downarrow [\rho]} F_\mu(w),$

which is again an analytic (linear!) map of $t$. We note that $\lim_{w \downarrow [\rho]} F_\mu(w)$ must be finite
as long as $\lim_{w \downarrow [\rho]} \int \frac{1}{(s - w)^2}\, d\rho(s) < +\infty$.

This has determined the analyticity of the correspondence between $t$ and the largest point of non-analyticity of $\omega_t$, and hence of $F_{\mu^{\boxplus t}}$. We have however remarked at the beginning of our proof that this point does not necessarily coincide with $[\mu^{\boxplus t}]$, and that moreover, the case in which it does not coincide corresponds to the case of an isolated atom of $\mu^{\boxplus t}$. Atoms of $\mu^{\boxplus t}$ have been however described in equation (14): it follows that the correspondence remains linear for $t$ in the interval $[1, (1-\mu(\{a\}))^{-1}]$. Moreover, when $t = (1-\mu(\{a\}))^{-1}$, we have $H_t'(a) = t + (1-t)F_\mu'(a) = t + (1-t)/\mu(\{a\}) = (1-\mu(\{a\}))^{-1} + (1 - (1-\mu(\{a\}))^{-1})/\mu(\{a\}) = 0$ (derivatives understood either in their proper sense, or in the Julia-Carathéodory sense), so at $t = (1-\mu(\{a\}))^{-1}$ we encounter a breach of analyticity of $\omega_t$ at the point $ta$.

This allows us to conclude that $t \mapsto [\mu^{\boxplus t}]$ has two possible regimes of evolution, either linear or according to $H_t(\max (H_t')^{-1}(\{0\}))$, and the two regimes "glue" continuously. This guarantees continuity of $t \mapsto [\mu^{\boxplus t}]$ on $(1,+\infty)$. If the linear evolution occurs at all, then continuity at $t = 1$ is obvious. If it does not, then we observe that $\lim_{t \to 1} \max (H_t')^{-1}(\{0\}) = [\rho] = [\mu]$, and moreover we can specify, by the Nevanlinna representation, that

$0 < \max (H_t')^{-1}(\{0\}) - [\mu] < \sqrt{(t-1)\left(\int s^2\,d\mu(s) - \left(\int s\,d\mu(s)\right)^2\right)}.$

Then

$[\mu^{\boxplus t}] = H_t(\max (H_t')^{-1}(\{0\})) = \max (H_t')^{-1}(\{0\}) + (t-1)\int s\,d\mu(s) + (1-t)\int \frac{1}{s - \max (H_t')^{-1}(\{0\})}\,d\rho(s),$

so it is enough to estimate $(t-1)\int \frac{1}{s - \max (H_t')^{-1}(\{0\})}\,d\rho(s)$.



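Recalling the equation determining $\max (H_t')^{-1}(\{0\})$, namely $\int \frac{d\rho(s)}{(s - \max (H_t')^{-1}(\{0\}))^2} = \frac{1}{t-1}$, and noting that

$\left(\int \frac{1}{s - \max (H_t')^{-1}(\{0\})}\,d\rho(s)\right)^2 \le \rho(\mathbb{R}) \int \frac{1}{(s - \max (H_t')^{-1}(\{0\}))^2}\,d\rho(s),$

we get

$\left((t-1)\int \frac{1}{s - \max (H_t')^{-1}(\{0\})}\,d\rho(s)\right)^2 \le (t-1)^2 \rho(\mathbb{R}) \int \frac{1}{(s - \max (H_t')^{-1}(\{0\}))^2}\,d\rho(s) = (t-1)\rho(\mathbb{R}).$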

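We obtain

$0 < [\mu^{\boxplus t}] - [\mu] < \sqrt{(t-1)\rho(\mathbb{R})} + (t-1)\rho(\mathbb{R}).$

This, together with the fact that the variance of $\mu^{\boxplus t}$ equals $t$ times the variance of $\mu$, guarantees that

$0 < [\mu^{\boxplus (t+\varepsilon)}] - [\mu^{\boxplus t}] < t\left(\sqrt{\varepsilon\rho(\mathbb{R})} + \varepsilon\rho(\mathbb{R})\right).$

Since $\rho(\mathbb{R}) = \mathrm{VAR}(\mu)$, this concludes our proof. □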
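The passage from an estimate at the base point to (15) can be sketched as follows. The sketch assumes the semigroup property of free additive convolution powers, $(\mu^{\boxplus t})^{\boxplus s} = \mu^{\boxplus ts}$ for $s, t \ge 1$, and reads the base-point estimate as $[\nu^{\boxplus s}] - [\nu] < \sqrt{(s-1)\mathrm{VAR}(\nu)} + (s-1)\mathrm{VAR}(\nu)$; this reading is an assumption, and $\nu$, $s$ are auxiliary symbols introduced only for this illustration.

```latex
% Auxiliary notation (not in the original): \nu = \mu^{\boxplus t}, s = 1 + \varepsilon/t.
\[
\mu^{\boxplus (t+\varepsilon)} = \big(\mu^{\boxplus t}\big)^{\boxplus (1+\varepsilon/t)},
\qquad \mathrm{VAR}(\mu^{\boxplus t}) = t\,\mathrm{VAR}(\mu).
\]
% Applying the assumed base-point estimate to \nu with s - 1 = \varepsilon/t:
\[
\begin{aligned}
[\mu^{\boxplus (t+\varepsilon)}] - [\mu^{\boxplus t}]
 &< \sqrt{\tfrac{\varepsilon}{t}\, t\,\mathrm{VAR}(\mu)} + \tfrac{\varepsilon}{t}\, t\,\mathrm{VAR}(\mu) \\
 &= \sqrt{\varepsilon\,\mathrm{VAR}(\mu)} + \varepsilon\,\mathrm{VAR}(\mu)
 \;\le\; t\Big(\sqrt{\varepsilon\,\mathrm{VAR}(\mu)} + \varepsilon\,\mathrm{VAR}(\mu)\Big),
 \qquad t \ge 1.
\end{aligned}
\]
```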

Note that, while the estimate provided by the above proposition is indeed optimal at $t = 1$, it is not optimal throughout $(1,+\infty)$. However, it will serve our purposes. Also, it is worth mentioning that the correspondence $t \mapsto [\mu^{\boxplus t}]$ may fail to be analytic on $(1,+\infty)$ only due to a "phase transition" from a linear to an essentially inverse quadratic regime.

Next we address the problem of continuity for the upper bound of the support of the multiplicative free convolution of two probability distributions on the positive half-line. More precisely, assume that there is a topological space $X$ and a pair of functions $f, g \colon X \to (A_+, \varphi)$, where $A_+$ denotes the set of positive elements in the non-commutative probability space $(A, \varphi)$. Assume that $f, g$ are weak* continuous (meaning that $X \ni \xi \mapsto \mu_{f(\xi)}$ is continuous from the topology of $X$ to the weak topology on the space of probability distributions compactly supported on $[0,+\infty)$, and the same for $g$), and in addition the maps $X \ni \xi \mapsto \|f(\xi)\|$, $X \ni \xi \mapsto \|g(\xi)\|$ are continuous. As noted before, $\|f(\xi)\| = [\mu_{f(\xi)}]$, and we shall use the two notations interchangeably.

It was noted before that $[\mu_{f(\xi)}]$ coincides with $\max\{x \in \mathbb{R} \colon G_{\mu_{f(\xi)}} \text{ not analytic in } x\}$. Equation (5) allows us to re-phrase this in terms of the moment generating function as

$\frac{1}{[\mu_{f(\xi)}]} = \min\{x \in \mathbb{R} \colon \psi_{\mu_{f(\xi)}} \text{ not analytic in } x\}.$

Let us recall that for any $\mu \neq \delta_0$ supported on the positive half-line, $\psi_\mu$ is strictly increasing on the interval $(-\infty, 1/[\mu])$, so $\lim_{x \uparrow 1/[\mu]} \psi_\mu(x)$ exists in $(0,+\infty]$. We shall denote it by $\psi_\mu(1/[\mu])$. In particular, the inverse function $\psi_\mu^{-1}$ of $\psi_\mu$ is defined on $(\mu(\{0\}) - 1, \psi_\mu(1/[\mu])]$, monotonic, and takes values in $(-\infty, 1/[\mu]]$. However, it is clear that $\psi_\mu^{-1}$ might have an analytic extension beyond $\psi_\mu(1/[\mu])$; indeed, that would correspond to the case when $(\psi_\mu^{-1})'(\psi_\mu(1/[\mu])) = 0$. Thus, we can give a description of $[\mu]$ in terms of $\psi_\mu^{-1}$:

$\frac{1}{[\mu]} = \min\{\psi_\mu^{-1}(x) \colon (\psi_\mu^{-1})'(r) > 0\ \forall r < x,\ \psi_\mu^{-1} \text{ not analytic in } x \text{ or } (\psi_\mu^{-1})'(x) = 0\}.$

(The case $x = +\infty$ is not excluded.)
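As an illustration of this description, consider the simplest case of a point mass $\mu = \delta_a$ with $a > 0$. The sketch below assumes the formula $\psi_\mu(z) = z^{-1}G_\mu(z^{-1}) - 1 = \int \frac{zs}{1-zs}\,d\mu(s)$ for the moment generating function; the example itself is not taken from the paper.

```latex
% Example (assumption: \mu = \delta_a with a > 0, so [\mu] = a and \mu(\{0\}) = 0).
\[
\psi_{\delta_a}(z) = \frac{az}{1-az}, \quad z \in (-\infty, 1/a),
\qquad \lim_{z \uparrow 1/a} \psi_{\delta_a}(z) = +\infty .
\]
% Solving x = az/(1-az) for z gives the inverse on (\mu(\{0\}) - 1, +\infty) = (-1, +\infty):
\[
\psi_{\delta_a}^{-1}(x) = \frac{x}{a(1+x)}, \qquad
(\psi_{\delta_a}^{-1})'(x) = \frac{1}{a(1+x)^2} > 0 .
\]
% No finite x is a critical point or a singularity, so the minimum is attained at x = +\infty:
\[
\lim_{x \to +\infty} \psi_{\delta_a}^{-1}(x) = \frac{1}{a} = \frac{1}{[\delta_a]} .
\]
```

This is an instance where the value $x = +\infty$ is the relevant one.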
The following proposition is concerned with the continuity of the correspondence $X \ni \xi \mapsto \|f(\xi)^{1/2} g(\xi) f(\xi)^{1/2}\|$ or, equivalently, the correspondence $X \ni \xi \mapsto [\mu_{f(\xi)} \boxtimes \mu_{g(\xi)}]$, where the sets $f(X)$ and $g(X)$ are assumed to be free with respect to $\varphi$. For mere convenience, we assume $X$ to be a metric space. We shall denote by $M_{f(\xi)}$ the largest positive number with the property that $\psi^{-1}_{\mu_{f(\xi)}}$ extends analytically to a complex neighbourhood of the interval $(\mu_{f(\xi)}(\{0\}) - 1, M_{f(\xi)})$. We will assume that $\psi^{-1}_{\mu_{f(\xi)}}$ extends continuously as a real function to $M_{f(\xi)}$ and we will denote by $\tilde\psi^{-1}_{\mu_{f(\xi)}}$ the continuous extension

^) ( " ) = i r X^» (r) ^ X " M/(?) 

Proposition 3.8. Let $X$ be a metric space, $(A, \varphi)$ a non-commutative probability space and $f, g \colon X \to (A_+ \setminus \{0\}, \varphi)$ two norm-bounded functions that take values in free subalgebras of $A$ satisfying the following conditions:

(1) The correspondences $X \ni \xi \mapsto \mu_{f(\xi)}, \mu_{g(\xi)}$ are weakly continuous;

(2) The correspondences $X \ni \xi \mapsto \|f(\xi)\|, \|g(\xi)\| \in (0,+\infty)$ are continuous;

(3) The correspondences $X \ni \xi \mapsto M_{f(\xi)}, M_{g(\xi)} \in (0,+\infty]$ are continuous;

(4) The correspondences $X \ni \xi \mapsto \tilde\psi^{-1}_{\mu_{f(\xi)}}, \tilde\psi^{-1}_{\mu_{g(\xi)}}$ are continuous in the uniform norm, in the sense that for any $\xi_0 \in X$,

$\lim_{\xi \to \xi_0} \sup_{r \in [0,+\infty]} \left|\tilde\psi^{-1}_{\mu_{f(\xi)}}(r) - \tilde\psi^{-1}_{\mu_{f(\xi_0)}}(r)\right| = 0;$

(5) If $\psi^{-1}_{\mu_{f(\xi_0)}}$ is analytic on some complex neighborhood of $[0,r]$, then there exists a neighborhood $U$ of $\xi_0$ and a complex neighborhood $V$ of $[0,r]$ so that $\psi^{-1}_{\mu_{f(\xi)}}$ is analytic on $V$ for all $\xi \in U$. Same statement is required to hold for $g$.

Then the correspondence $X \ni \xi \mapsto [\mu_{f(\xi)} \boxtimes \mu_{g(\xi)}] \in (0,+\infty)$ is continuous.

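Before starting the proof, we should mention that, as weak continuity for $f$ (condition (1)) is equivalent to continuity in the topology of the uniform convergence on compacts for $\psi_{\mu_{f(\xi)}}$, if $\{\xi_n\}_{n \in \mathbb{N}} \subseteq X$ converges to $\xi_0$ and $\psi^{-1}_{\mu_{f(\xi_n)}}, \psi^{-1}_{\mu_{f(\xi_0)}}$ have a common domain, then $\psi^{-1}_{\mu_{f(\xi_n)}}$ converges to $\psi^{-1}_{\mu_{f(\xi_0)}}$ uniformly on compacts of the common domain. Condition (5) is devised in order to efficiently exploit this property. Condition (4) is a bit stronger than it appears: it says that if $\xi_n$ converges to $\xi$ in $X$ and $r_n \in [0, M_{f(\xi_n)}]$ converges to $r \in [0, M_{f(\xi)}]$, then $\tilde\psi^{-1}_{\mu_{f(\xi_n)}}(r_n)$ converges to $\tilde\psi^{-1}_{\mu_{f(\xi)}}(r)$ as $n \to \infty$. Indeed, $|\tilde\psi^{-1}_{\mu_{f(\xi)}}(r) - \tilde\psi^{-1}_{\mu_{f(\xi)}}(r_n)| \to 0$ as $n \to \infty$ by the continuity of $\tilde\psi^{-1}_{\mu_{f(\xi)}}$, and $|\tilde\psi^{-1}_{\mu_{f(\xi)}}(r_n) - \tilde\psi^{-1}_{\mu_{f(\xi_n)}}(r_n)| \to 0$ as $n \to \infty$ by condition (4). In addition, our conventions for doing arithmetic with infinity are $\infty$ + a real number $= \infty$, and $\infty - \infty = 0$, so that the sets $\{x \colon \tilde\psi^{-1}_{\mu_{f(\xi_n)}}(x) = +\infty\}$ and $\{x \colon \tilde\psi^{-1}_{\mu_{f(\xi)}}(x) = +\infty\}$ must coincide when $n$ is large enough.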

Proof. The statement of the proposition is local in nature: thus, let us choose $\xi_0 \in X$ and an arbitrary sequence $\{\xi_n\}_{n \in \mathbb{N}} \subseteq X$ converging to $\xi_0$. It should be recorded that condition (1) and the weak continuity result of Bercovici and Voiculescu [10] for free multiplicative convolution imply that $\liminf_{n \to \infty} [\mu_{f(\xi_n)} \boxtimes \mu_{g(\xi_n)}] \ge [\mu_{f(\xi_0)} \boxtimes \mu_{g(\xi_0)}]$ (we might "lose," but not "gain" support when passing to weak limit). We shall prove that $\lim_{n \to \infty} [\mu_{f(\xi_n)} \boxtimes \mu_{g(\xi_n)}] = [\mu_{f(\xi_0)} \boxtimes \mu_{g(\xi_0)}]$. In order to do that, we shall use ©, the $S$-transform property of Voiculescu. This translates, in terms of the moment-generating function, in

$\psi^{-1}_{\mu_{f(\xi)} \boxtimes \mu_{g(\xi)}}(z) = \frac{1+z}{z}\,\psi^{-1}_{\mu_{f(\xi)}}(z)\,\psi^{-1}_{\mu_{g(\xi)}}(z).$

This relation holds for $z$ in the interval bounded below by $\max\{\mu_{f(\xi)}(\{0\}), \mu_{g(\xi)}(\{0\})\} - 1$ and above by the minimum between the domains of $\psi^{-1}_{\mu_{f(\xi)}}$ and $\psi^{-1}_{\mu_{g(\xi)}}$, viewed as inverses of the corresponding functions. However, there are many circumstances in which the above equality can be continued analytically (as complex functions) further along the positive axis. The maximum domain in $\mathbb{R}$ is the interval $(\max\{\mu_{f(\xi)}(\{0\}), \mu_{g(\xi)}(\{0\})\} - 1, M_\xi)$, where $M_\xi$ is no smaller than the least of the upper bounds $M_{f(\xi)}, M_{g(\xi)}$ of the domains of $\psi^{-1}_{\mu_{f(\xi)}}$ and $\psi^{-1}_{\mu_{g(\xi)}}$. In that case, $1/[\mu_{f(\xi)} \boxtimes \mu_{g(\xi)}]$ equals either $\psi^{-1}_{\mu_{f(\xi)} \boxtimes \mu_{g(\xi)}}(M_\xi)$ or $\psi^{-1}_{\mu_{f(\xi)} \boxtimes \mu_{g(\xi)}}(x_\xi)$, where $x_\xi$ is the smallest critical point of $\psi^{-1}_{\mu_{f(\xi)} \boxtimes \mu_{g(\xi)}}$, if existing.

For simplicity, we shall denote $a_n = \psi^{-1}_{\mu_{f(\xi_n)}}$, $b_n = \psi^{-1}_{\mu_{g(\xi_n)}}$, $c_n = \psi^{-1}_{\mu_{f(\xi_n)} \boxtimes \mu_{g(\xi_n)}}$, with the obvious changes when $n$ is replaced by 0 or simply eliminated. We shall split the proof in two cases:

Case 1: There exists a point $x_{\xi_0} > 0$ in the domain of $c_0$ so that $c_0'(x_{\xi_0}) = 0$ as a complex function. Without loss of generality, we may assume that this point $x_{\xi_0}$ is the smallest satisfying this condition, so that $c_0(x_{\xi_0}) = 1/[\mu_{f(\xi_0)} \boxtimes \mu_{g(\xi_0)}]$ (and thus $c_0$ extends analytically on a complex neighborhood of $[0, x_{\xi_0}]$).

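Thus, by the $S$-transform property, there is a neighborhood of $[0, x_{\xi_0})$ on which $a_0$ and $b_0$ extend analytically. Indeed, assume towards contradiction there exists a point $r \in (0, x_{\xi_0})$ so that, say, $a_0$ does not extend analytically to it.

Our hypothesis for Case 1 guarantees that $\psi_{\mu_{f(\xi_0)} \boxtimes \mu_{g(\xi_0)}}([0, 1/[\mu_{f(\xi_0)} \boxtimes \mu_{g(\xi_0)}]]) = [0, x_{\xi_0}]$ (bijective correspondence) and the only obstacle to the analytic extension of $\psi_{\mu_{f(\xi_0)} \boxtimes \mu_{g(\xi_0)}}$ to $1/[\mu_{f(\xi_0)} \boxtimes \mu_{g(\xi_0)}]$ is the zero derivative of $c_0$ in $x_{\xi_0}$. If we replace in the moment-generating function version of the $S$-transform equation (given above) the variable $z$ by $\psi_{\mu_{f(\xi_0)} \boxtimes \mu_{g(\xi_0)}}(z)$ we obtain

$\frac{z\,\psi_{\mu_{f(\xi_0)} \boxtimes \mu_{g(\xi_0)}}(z)}{1 + \psi_{\mu_{f(\xi_0)} \boxtimes \mu_{g(\xi_0)}}(z)} = a_0(\psi_{\mu_{f(\xi_0)} \boxtimes \mu_{g(\xi_0)}}(z))\, b_0(\psi_{\mu_{f(\xi_0)} \boxtimes \mu_{g(\xi_0)}}(z)).$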
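The substitution above can be written out as follows. The sketch assumes the moment-generating-function form of the $S$-transform relation, $c_0(x) = \frac{1+x}{x}\,a_0(x)\,b_0(x)$, and abbreviates $\psi = \psi_{\mu_{f(\xi_0)} \boxtimes \mu_{g(\xi_0)}}$ (an abbreviation used only here).

```latex
% Abbreviation (only for this sketch): \psi = \psi_{\mu_{f(\xi_0)} \boxtimes \mu_{g(\xi_0)}}, so c_0 = \psi^{-1}.
\[
c_0(x) = \frac{1+x}{x}\, a_0(x)\, b_0(x)
\quad\xrightarrow{\;x = \psi(z)\;}\quad
z = c_0(\psi(z)) = \frac{1+\psi(z)}{\psi(z)}\, a_0(\psi(z))\, b_0(\psi(z)),
\]
% and multiplying both sides by \psi(z)/(1+\psi(z)):
\[
\frac{z\,\psi(z)}{1+\psi(z)} = a_0(\psi(z))\, b_0(\psi(z)).
\]
```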

Denote for convenience $\omega_1 = a_0 \circ \psi_{\mu_{f(\xi_0)} \boxtimes \mu_{g(\xi_0)}}$, $\omega_2 = b_0 \circ \psi_{\mu_{f(\xi_0)} \boxtimes \mu_{g(\xi_0)}}$. It has been shown in [13] that $\omega_j$ extend analytically to $\mathbb{C} \setminus [0,+\infty)$, preserve $\mathbb{C}^+$ and increase the argument of the variable ($\arg \omega_j(z) \ge \arg z$, $z \in \mathbb{C}^+$), and in [3] that their restriction to the upper half-plane extends continuously to $\mathbb{R}$. In particular, $a_0, b_0$ extend continuously to $[0, x_{\xi_0}]$. Moreover, since $\psi_{\mu_{f(\xi_0)} \boxtimes \mu_{g(\xi_0)}}$ is real on $[0, 1/[\mu_{f(\xi_0)} \boxtimes \mu_{g(\xi_0)}]]$, $\omega_1, \omega_2$ must also be real on this same interval (see also [3]), and thus $a_0, b_0$ are continuous real functions on $[0, x_{\xi_0}]$. A direct application of the Schwarz reflection principle guarantees that $a_0, b_0$ extend analytically to a neighborhood of $[0, x_{\xi_0})$ in $\mathbb{C}$, as claimed. It should be noted in addition that, as $\omega_j$ preserve half-planes (see [13]), both $\omega_j$ are analytic on $[0, 1/[\mu_{f(\xi_0)} \boxtimes \mu_{g(\xi_0)}])$, and so $a_0'(r), b_0'(r) > 0$ for all $r \in [0, x_{\xi_0})$.

The analytic extension of $c_0$ around $x_{\xi_0}$ together with the fact that $c_0'(x_{\xi_0}) = 0$ guarantees that there exists an $n > 0$ so that $z \mapsto \left(\psi_{\mu_{f(\xi_0)} \boxtimes \mu_{g(\xi_0)}}(z) - \psi_{\mu_{f(\xi_0)} \boxtimes \mu_{g(\xi_0)}}(1/[\mu_{f(\xi_0)} \boxtimes \mu_{g(\xi_0)}])\right)^n$ is analytic in a neighborhood of $1/[\mu_{f(\xi_0)} \boxtimes \mu_{g(\xi_0)}]$. We shall denote by $\mathcal{S}$ the Riemann surface determined by the corresponding $n$-th root. We shall argue that, with the above notations, $\omega_1$ and $\omega_2$ extend analytically to a piece of $\mathcal{S}$ which projects onto a neighborhood of $1/[\mu_{f(\xi_0)} \boxtimes \mu_{g(\xi_0)}]$ (of course, excluding $1/[\mu_{f(\xi_0)} \boxtimes \mu_{g(\xi_0)}]$). We shall do this at first under the additional assumption that $\psi_{\mu_{f(\xi_0)}}$ and $\psi_{\mu_{g(\xi_0)}}$ do not share any critical values. Indeed, let us follow $\psi_{\mu_{f(\xi_0)} \boxtimes \mu_{g(\xi_0)}} = \psi_{\mu_{f(\xi_0)}} \circ \omega_1$ along an arbitrary path $p$ in $\mathcal{S}$ starting in the upper half-plane close enough to $1/[\mu_{f(\xi_0)} \boxtimes \mu_{g(\xi_0)}]$. Since $\arg \psi_{\mu_{f(\xi_0)}}(w) \ge \arg w$ for $w$ in the upper half-plane, $\psi_{\mu_{f(\xi_0)} \boxtimes \mu_{g(\xi_0)}}$ will stay in the upper half-plane as long as $\omega_1(z)$ does. Thus, we can then write $\omega_1(z) = \psi^{-1}_{\mu_{f(\xi_0)}}(\psi_{\mu_{f(\xi_0)} \boxtimes \mu_{g(\xi_0)}}(z))$ whenever $\omega_1(z)$ is still in $\mathbb{C}^+$ as $z$ runs through $p$. The only obstacle to the analytic extension of $\omega_1$ through a $z_k$ is a zero derivative of $\psi_{\mu_{f(\xi_0)}}$ in the point $\omega_1(z_k) \in \mathbb{C}^+$. Then $\psi'_{\mu_{f(\xi_0)}}(\omega_1(z_k))\,\omega_1'(z_k) = \psi'_{\mu_{f(\xi_0)} \boxtimes \mu_{g(\xi_0)}}(z_k)$. As without loss of generality $\psi'_{\mu_{f(\xi_0)} \boxtimes \mu_{g(\xi_0)}}(z_k) \neq 0$, it follows that $\omega_1'$ has infinite limit in $z_k$. Since also $\psi'_{\mu_{g(\xi_0)}}(\omega_2(z_k))\,\omega_2'(z_k) = \psi'_{\mu_{f(\xi_0)} \boxtimes \mu_{g(\xi_0)}}(z_k)$, it follows from the $S$-transform equation and analytic continuation that necessarily $\omega_2'(z_k)$ is infinite, and moreover, the zeros of $\psi'_{\mu_{f(\xi_0)}}$ and $\psi'_{\mu_{g(\xi_0)}}$ in $\omega_1(z_k)$ and $\omega_2(z_k)$, respectively, must be of the same order. But since $\psi_{\mu_{f(\xi_0)} \boxtimes \mu_{g(\xi_0)}} = \psi_{\mu_{f(\xi_0)}} \circ \omega_1 = \psi_{\mu_{g(\xi_0)}} \circ \omega_2$, we conclude that $\psi_{\mu_{f(\xi_0)}}$ and $\psi_{\mu_{g(\xi_0)}}$ share the critical value $\psi_{\mu_{f(\xi_0)} \boxtimes \mu_{g(\xi_0)}}(z_k)$, contradicting our hypothesis. (For the origins of this idea, see [35].)

Thus, under the additional hypothesis regarding critical values, we have shown that $\omega_1$ and $\omega_2$ extend to a simply connected domain $D$ of $\mathcal{S}$ which they map onto $V \setminus [\omega_1(1/[\mu_{f(\xi_0)} \boxtimes \mu_{g(\xi_0)}]), +\infty)$ and $V' \setminus [\omega_2(1/[\mu_{f(\xi_0)} \boxtimes \mu_{g(\xi_0)}]), +\infty)$, respectively. Here $V$ and $V'$ are complex neighborhoods of $\omega_1(1/[\mu_{f(\xi_0)} \boxtimes \mu_{g(\xi_0)}])$ and $\omega_2(1/[\mu_{f(\xi_0)} \boxtimes \mu_{g(\xi_0)}])$, respectively. Moreover, these extensions still satisfy the relations $\psi_{\mu_{f(\xi_0)} \boxtimes \mu_{g(\xi_0)}} = \psi_{\mu_{f(\xi_0)}} \circ \omega_1 = \psi_{\mu_{g(\xi_0)}} \circ \omega_2$. Since $\psi_{\mu_{f(\xi_0)} \boxtimes \mu_{g(\xi_0)}}$ extends analytically to all of $D$, we use the fact that the moment generating functions increase arguments to conclude that $\psi_{\mu_{f(\xi_0)}}$ and $\psi_{\mu_{g(\xi_0)}}$ must extend analytically to some interval $(a_0(x_{\xi_0}), a_0(x_{\xi_0}) + \varepsilon)$ and $(b_0(x_{\xi_0}), b_0(x_{\xi_0}) + \varepsilon)$, respectively, and thus $a_0$ and $b_0$ must themselves extend analytically (and bijectively!) to some complex neighborhood of $[0, x_{\xi_0}]$.

We have proved our claim under the additional assumption that $\psi_{\mu_{f(\xi_0)}}$ and $\psi_{\mu_{g(\xi_0)}}$ do not share any critical values. To complete the proof of our claim that $a_0, b_0$ extend to some complex neighborhood of $[0, x_{\xi_0}]$ regardless of this condition being fulfilled, we only need to observe that a translation of a measure $\mu$ by a real number $k$ changes $w \mapsto \psi_\mu(w) = w^{-1}G_\mu(w^{-1}) - 1$ into $w \mapsto w^{-1}G_\mu(w^{-1} - k) - 1$. Then $[w^{-1}G_\mu(w^{-1} - k) - 1]' = -w^{-2}G_\mu(w^{-1} - k) - w^{-3}G_\mu'(w^{-1} - k)$. Thus, critical values change continuously in $k$. Since there can be at most countably many critical values, we conclude that there exists a sequence $k_m$ tending to zero so that the moment generating functions of the translates of $\mu_{f(\xi_0)}$ by $k_m$ and $\psi_{\mu_{g(\xi_0)}}$ have no common critical values. Passing to the limit as $k_m \to 0$ provides the required answer.

But now the result under the assumption of Case 1 is proved; by part (5) of our Proposition there exists a neighborhood $V$ of $[0, x_{\xi_0}]$ on which $a_n$ and $b_n$ extend analytically, and by part (1) they converge to $a_0$ and $b_0$, respectively. By the $S$-transform property, $c_n \to c_0$ on $V$ as $n \to \infty$, so there are points $d_n \in (0,+\infty)$ so that $c_n'(d_n) = 0$ and $\lim_{n \to \infty} d_n = x_{\xi_0}$. So $\lim_{k \to \infty} c_{n_k}(x_{\xi_{n_k}}) = \lim_{k \to \infty} 1/[\mu_{f(\xi_{n_k})} \boxtimes \mu_{g(\xi_{n_k})}] = 1/[\mu_{f(\xi_0)} \boxtimes \mu_{g(\xi_0)}] = c_0(x_{\xi_0})$, as claimed.

Case 2: For any $x > 0$ in the domain of $c_0$, we have $c_0'(x) > 0$. If we denote as before $M_{\xi_0}$ to be the upper bound of the domain of $c_0$, then $c_0(M_{\xi_0}) := \lim_{x \uparrow M_{\xi_0}} c_0(x)$ exists, belongs to $(0,+\infty)$ (although $M_{\xi_0}$ might be equal to $+\infty$) and equals $1/[\mu_{f(\xi_0)} \boxtimes \mu_{g(\xi_0)}]$. By the $S$-transform equation, it follows that at least one of $a_0, b_0$ must have $M_{\xi_0}$ as upper bound of the domain of analyticity. Without loss of generality, assume that $M_{\xi_0} = M_{f(\xi_0)}$ is the upper bound for the domain of analyticity of $a_0$. Condition (3) implies that $M_{f(\xi_n)} \to M_{f(\xi_0)}$ as $n \to \infty$. As the same condition holds for $g$, we easily conclude that $\lim_{n \to \infty} \min\{M_{f(\xi_n)}, M_{g(\xi_n)}\} = M_{\xi_0}$ (limits considered in $[0,+\infty]$). If there is an $n_0 \in \mathbb{N}$ so that $c_n$ has no critical point in $(0, \min\{M_{f(\xi_n)}, M_{g(\xi_n)}\})$ for any $n > n_0$, then condition (4) and the $S$-transform property allow us to conclude. Assume that for infinitely many $n$ the function $c_n$ has a critical point in $(0, \min\{M_{f(\xi_n)}, M_{g(\xi_n)}\})$; call the smallest of them $\zeta_n$. Then we know that $c_n(\zeta_n) = 1/[\mu_{f(\xi_n)} \boxtimes \mu_{g(\xi_n)}]$. Since $a_0, b_0$ are analytic on some complex neighborhood of $[0, M_{\xi_0})$, condition (5) tells us that for any $s \in [0, M_{\xi_0})$ there exists a neighborhood $V_s$ of $[0,s]$ in $\mathbb{C}$ so that, from a certain $n$ on, all $a_n, b_n$ have an analytic extension to $V_s$. If there is a subsequence $\{\zeta_{n_k}\}_k$ which converges to a point $r < M_{\xi_0}$, then $c_{n_k}$ converges to $c_0$ uniformly on compacts of $V_r$ by condition (1) and thus $r$ is a critical point of $c_0$, contradicting the assumption of Case 2. The case when $\zeta_n$ converges to $M_{\xi_0}$ as $n \to \infty$ is covered by condition (4): indeed, this condition implies that $c_n(\zeta_n) = \frac{1+\zeta_n}{\zeta_n}\,a_n(\zeta_n)\,b_n(\zeta_n) \to \frac{1+M_{\xi_0}}{M_{\xi_0}}\,a_0(M_{\xi_0})\,b_0(M_{\xi_0}) = c_0(M_{\xi_0})$ as $n \to \infty$. (Here we use the obvious convention $\infty/\infty = 1$.) This concludes our proof. □

We would like to emphasize that some of the conditions of the above proposition can be 
weakened or replaced with conditions of a different nature: we use this set of conditions 
simply because it covers a conveniently large family of distributions for our purposes. 

Corollary 3.9. If $\mu$ is a fixed compactly supported probability measure on $[0,+\infty)$, $a = (a_1, \ldots, a_m) \in [0,+\infty)^m \setminus \{(0,\ldots,0)\}$, $t = (t_1, \ldots, t_m) \in (0,1)^m \cap \Delta_m$ (so $(t_1, \ldots, t_m)$ satisfy $\sum_{j=1}^m t_j = 1$), and $\nu(a,t) = \sum_{j=1}^m t_j \delta_{a_j}$, then the correspondence $([0,+\infty)^m \setminus \{(0,\ldots,0)\}) \times \Delta_m \ni (a,t) \mapsto [\mu \boxtimes \nu(a,t)]$ is continuous.

Proof. We shall apply the previous proposition, with the identifications $X = ([0,+\infty)^m \setminus \{(0,\ldots,0)\}) \times \Delta_m$, $f$ the constant function taking value $\mu$, and $g(a,t) = \nu(a,t) = \sum_{j=1}^m t_j \delta_{a_j}$. One checks that $f$ satisfies all conditions from the proposition above. The weak continuity of $g$ is equally clear, as is the continuity of the correspondence $(a,t) \mapsto [\nu(a,t)]$. Observing that $\psi_{\nu(a,t)}$ maps $(-\infty, 1/\max\{a_1, \ldots, a_m\})$ monotonically and bijectively onto $\left(\sum_{j \colon a_j = 0} t_j - 1, +\infty\right)$ assures us that the upper bound of the domain of $\psi^{-1}_{\nu(a,t)}$ is constantly equal to infinity, and hence continuous, and moreover, $\psi^{-1}_{\nu(a,t)}$ maps plus infinity into $1/[\nu(a,t)]$, guaranteeing the continuity of $(a,t) \mapsto \tilde\psi^{-1}_{\nu(a,t)}$, and hence the verification of conditions (3) and (4). Condition (5) is verified by the constant function $f$. For $g$ one only needs to recall the observations following equation © to note that indeed, given any compact subset of $X$, there is a complex neighbourhood of $[0,+\infty)$ on which $\psi^{-1}_{\nu(a,t)}$ is analytic for all $(a,t)$ in the given compact set. Thus, a stronger version of condition (5) is satisfied by $g$. Applying the above proposition allows us to conclude. □

4. Almost sure convergence of norms of random matrices 

Let GUE be the Gaussian Unitary Ensemble, i.e. the probability measure on $M_n(\mathbb{C})$ with support on self-adjoint matrices and density proportional to $\exp(-\frac{n}{2}\mathrm{Tr}(A^2))\,dA$. The following theorem was obtained in the seminal paper [23] by Haagerup and Thorbjørnsen:

Theorem 4.1. Let $X_n, Y_n$ be two i.i.d. GUE random variables on $M_n(\mathbb{C})$ and $P$ be a non-commutative polynomial in two variables. Then, almost surely as $n \to \infty$,

$\|P(X_n, Y_n)\| \to \|P(x,y)\|$

where $x, y$ are free semi-circular elements in a finite von Neumann algebra.

The aim of this section is to build on Theorem 4.1 and extend it to some specific non-commutative monomials of random matrices with prescribed spectra.

We recall that if $X$ is an $n$-dimensional self-adjoint matrix, its eigenvalue counting measure is $n^{-1}\sum_{i=1}^n \delta_{\lambda_i}$ where $\lambda_i$ are the eigenvalues of $X$. For any probability measure $\mu$ on the real line, its distribution function is defined as $f_\mu \colon t \mapsto \mu((-\infty, t])$.

For the purposes of this section, we will say that a sequence of distribution functions $f_n$ tends to a distribution function $f$ iff for all $\varepsilon > 0$, there exists an $n_0$ such that for all $n \ge n_0$,

(16) $\forall t \in \mathbb{R}, \quad f(t-\varepsilon) - \varepsilon \le f_n(t) \le f(t+\varepsilon) + \varepsilon.$
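A numerical reading of condition (16) may help. The sketch below (Python, standard library only) checks (16) for two empirical distribution functions on a finite grid of test points. The function names `empirical_cdf` and `satisfies_16_on_grid`, the grid and the use of non-strict inequalities are illustrative assumptions, and a finite grid can only approximate the quantifier over all real $t$.

```python
from bisect import bisect_right


def empirical_cdf(eigenvalues):
    """Distribution function t -> mu((-inf, t]) of the eigenvalue counting measure."""
    xs = sorted(eigenvalues)
    n = len(xs)

    def cdf(t):
        # bisect_right counts the eigenvalues that are <= t
        return bisect_right(xs, t) / n

    return cdf


def satisfies_16_on_grid(f_n, f, eps, grid):
    """Check f(t - eps) - eps <= f_n(t) <= f(t + eps) + eps at every grid point t."""
    return all(f(t - eps) - eps <= f_n(t) <= f(t + eps) + eps for t in grid)


# Hypothetical example: a single atom at 0 versus a single atom at 0.05.
f = empirical_cdf([0.0])
f_n = empirical_cdf([0.05])
grid = [i / 1000 for i in range(-1000, 1001)]
close_at_scale_0_1 = satisfies_16_on_grid(f_n, f, 0.1, grid)
close_at_scale_0_01 = satisfies_16_on_grid(f_n, f, 0.01, grid)
```

In this toy case the shift by 0.05 is tolerated at scale $\varepsilon = 0.1$ but not at $\varepsilon = 0.01$, which is the sense in which (16) measures closeness of distribution functions.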

Theorem 4.2. Let A n , B n be independent positive self-adjoint random matrices in M n (C), 
such that at least one of A n or B n has a distribution invariant under unitary conjugation. 
Let f n be the distribution function of A n and g n be the distribution function of B n . Assume 
that the (a priori random) distribution functions f n ,g n converge almost surely respectively 
to f,g which are distribution functions of two self-adjoint, bounded and freely independent 
random variables $x$ and $y$. Assume also that the operator norm of $A_n$ (resp. $B_n$) converges to the operator norm of $x$ (resp. $y$).
Then, almost surely as $n \to \infty$,

$\|A_n B_n\| \to \|xy\|.$

Similar results have been obtained recently by C. Male [25]. However, our results do not clearly follow from his. We also believe that the above theorem could be proved directly with determinantal processes methods, see e.g. [21], [26], at least in the case where one of the operators is a projection.

Note also that 6 months after the first version of this paper was completed, one author and C. Male used one key ingredient introduced in the proof below to prove a substantial extension of Theorem 4.1 in the unitary case, see [17]. The more recent main result of [17], even though quite general, does not imply directly Theorem 4.2 because our assumptions on the spectrum of $A_n, B_n$ are not as restrictive as in [17].



Proof of Theorem 4.2. Without loss of generality, we can assume that both $A_n$ and $B_n$ have distributions which are invariant under unitary conjugation (indeed, replacing the pair $(A_n, B_n)$ by $(V_n A_n V_n^*, V_n B_n V_n^*)$, where $V_n$ is a Haar distributed random unitary matrix independent from $(A_n, B_n)$, does not change the hypotheses nor $\|A_n B_n\|$, but enforces unitary invariance on both $A_n$ and $B_n$).

The main idea is to adapt Theorem 4.1 to our case by showing that in the case where $P$ of Theorem 4.1 is of the form $P_1(x)P_2(y)$, it extends to the situation where $P_1, P_2$ are any nondecreasing bounded functions.

In this proof we consider a pair $X_n, Y_n \in M_n(\mathbb{C})$ of i.i.d. GUE random matrices and we split our proof into three steps. In the first two steps, we show how we can replace in Theorem 4.1 polynomials by real, non-decreasing, càdlàg, non-negative and bounded functions. In the third step, we show how, via functional calculus, we can modify the pair $(X_n, Y_n)$ into a pair that has the same distribution as $(A_n, B_n)$.

Step I. First, we prove that if $P$ is any real positive polynomial and $S_0$ is a distribution function (real, non-decreasing, càdlàg and positive), then, for all $\varepsilon > 0$, for a fixed small enough neighborhood $V$ of $S_0$, almost surely, there exists $n_0 \in \mathbb{N}$ such that, for all $n \ge n_0$ and for all $S \in V$,

(17) $\left|\, \|P(X_n)S(Y_n)P(X_n)\|_\infty - \|P(x)S(y)P(x)\| \,\right| < \varepsilon,$

where $x, y$ are free semicircular elements in a $\mathrm{II}_1$ factor.

For $\varepsilon > 0$, we introduce the functions $S_0^+(x) = S_0(x+\varepsilon) + \varepsilon$ and $S_0^-(x) = S_0(x-\varepsilon) - \varepsilon$. Clearly, we have $S_0^- \le S_0 \le S_0^+$. Moreover, since the neighborhood $V$ of $S_0$ can be chosen as small as we need to, we can choose it in such a way that for all $S \in V$, the jumping points of $S$ are at distance at most $\varepsilon/100$ from the jumping points of $S_0$. By the Stone-Weierstrass theorem, there exist polynomials $Q^\pm$ such that, on the interval $[-3,3]$, for all $S \in V$, $S_0^- \le Q^- \le S \le Q^+ \le S_0^+$ (see Figure 2).

The fact that almost surely the eigenvalues of $X_n$ and $Y_n$ are included in $[-3,3]$ as $n \to \infty$ implies that almost surely, for all $S \in V$ and for $n$ large enough, $S(Y_n) \le Q^+(Y_n)$. Therefore, almost surely for $n$ large enough, $P(X_n)S(Y_n)P(X_n) \le P(X_n)Q^+(Y_n)P(X_n)$ and thus, using positivity,

$\|P(X_n)S(Y_n)P(X_n)\|_\infty \le \|P(X_n)Q^+(Y_n)P(X_n)\|_\infty.$

However, $P$ and $Q^+$ are polynomials, therefore we can use Theorem 4.1 to claim that $\|P(X_n)Q^+(Y_n)P(X_n)\|_\infty \to \|P(x)Q^+(y)P(x)\|$. We have shown that, almost surely,

$\limsup_{n \to \infty} \|P(X_n)S(Y_n)P(X_n)\|_\infty \le \|P(x)Q^+(y)P(x)\|.$
Figure 2. Bounding distribution functions uniformly by polynomials.

In the von Neumann algebra generated by two free semicircular elements $x, y$, we have the inequality $P(x)Q^+(y)P(x) \le P(x)S_0^+(y)P(x)$, therefore

$\limsup_{n \to \infty} \|P(X_n)S(Y_n)P(X_n)\|_\infty \le \|P(x)S_0^+(y)P(x)\|.$

Since this is true for all $\varepsilon > 0$ and the norm is continuous according to Corollary 3.9, by letting $\varepsilon \to 0$ we get

$\limsup_{n \to \infty} \|P(X_n)S(Y_n)P(X_n)\|_\infty \le \|P(x)S(y)P(x)\|.$

A similar argument, using this time $Q^-$ and $S_0^-$ to bound from below elements $S \in V$, proves the other inequality and completes the first step of the proof. Note however that the lower bound could have been obtained without using Theorem 4.1: indeed, one can use Voiculescu's result for the convergence of empirical spectral distributions of random matrices to conclude.

Step II. The second part of our proof is to show that one can replace the polynomial $P$ in equation (17) by another function $T$ chosen from a neighborhood $W$ of a given distribution function $T_0$. First, note that in Step I, one can interchange the roles of the polynomial $P$ and the step function $S$ by using the $C^*$-algebra equality $\|a\|^2 = \|aa^*\|$. Hence, $\|S(X_n)P(Y_n)S(X_n)\|_\infty$ converges to $\|S(x)P(y)S(x)\|$. Then, we employ the same technique as in Step I: we bound any element $T \in W$ by fixed polynomials $P^\pm$ and we use Step I to conclude. Note that in the first two steps of the proof we have considered GUE matrices $X_n$ and $Y_n$.
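The interchange of roles in Step II can be sketched as follows, assuming $P(X)$ is self-adjoint and $S(Y) \ge 0$; the auxiliary element $a = P(X)S(Y)^{1/2}$ is notation introduced only for this illustration.

```latex
% Auxiliary element (not in the original): a = P(X) S(Y)^{1/2}, with P(X) = P(X)^* and S(Y) \ge 0.
\[
a a^* = P(X)\, S(Y)\, P(X), \qquad a^* a = S(Y)^{1/2}\, P(X)^2\, S(Y)^{1/2},
\]
% and the C^*-identity \|a\|^2 = \|a a^*\| = \|a^* a\| gives
\[
\| P(X)\, S(Y)\, P(X) \| = \| S(Y)^{1/2}\, P(X)^2\, S(Y)^{1/2} \| .
\]
```

The right-hand side has the polynomial $P^2$ sandwiched between functions of the other variable; since $X_n$ and $Y_n$ are i.i.d., exchanging their names yields the configuration used in Step II.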

Step III. In this final step, we consider our original sequence (A n ,B n ) and show that 
our conclusion holds for it. For the purpose of its study, we introduce an auxiliary pair 
(X n ,Y n ) of two i.i.d Gaussian ensembles. It is known that with probability one, all its 
eigenvalues have multiplicity one. So without loss of generality, we will assume that our 
instance of (X n ,Y n ) does not have multiplicity in its eigenvalues. Similarly, we assume that 
the normalized eigenvalue counting function of X n and Y n converges towards the semi- 
circle and that their operator norm converges to 2. It is also possible to do so without loss 
of generality because of the well known convergence properties of the Gaussian unitary 
ensembles [23] . 

From this, it follows that there exist two non-decreasing càdlàg functions $f_n, g_n$ such that the eigenvalues of $f_n(X_n)$ are the same as those of $A_n$ and the eigenvalues of $g_n(Y_n)$ are the same as those of $B_n$.

The functions f n and g n are not unique and are random, but it follows from our hy- 
potheses on the limiting distributions of A n , B n and our choice of X n , Y n that it is possible 
to make sure that f n and g n converge uniformly. 
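The existence of such functions can be sketched in a few lines (Python, standard library only). Given the pairwise distinct eigenvalues of $X_n$ and the eigenvalues of $A_n$, the hypothetical helper `monotone_step_function` below returns a non-decreasing, right-continuous step function sending the $i$-th largest eigenvalue of $X_n$ to the $i$-th largest eigenvalue of $A_n$. It illustrates only the existence claim and does not address the uniform convergence of $f_n$; the helper name and the constant extension to the left of the smallest eigenvalue are assumptions.

```python
from bisect import bisect_right


def monotone_step_function(xs, targets):
    """Non-decreasing cadlag step function f with f(i-th largest x) = i-th largest target."""
    xs_sorted = sorted(xs)
    ts_sorted = sorted(targets)
    if len(xs_sorted) != len(ts_sorted):
        raise ValueError("need as many target values as eigenvalues")
    if len(set(xs_sorted)) != len(xs_sorted):
        raise ValueError("the eigenvalues xs must be pairwise distinct")

    def f(x):
        k = bisect_right(xs_sorted, x)  # number of x_i that are <= x
        # Left of the smallest x_i we extend by the smallest target (a convention).
        return ts_sorted[max(k, 1) - 1]

    return f


# Hypothetical spectra: X_n has simple eigenvalues, A_n may have repeated ones.
x_eigs = [0.3, -1.2, 1.7]
a_eigs = [5.0, 2.0, 2.0]
f_n = monotone_step_function(x_eigs, a_eigs)
images = [f_n(x) for x in sorted(x_eigs, reverse=True)]
```

Here `images` lists $f_n(x_1), f_n(x_2), f_n(x_3)$ for $x_1 > x_2 > x_3$, which reproduces the eigenvalues of the target matrix in decreasing order.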
Let us denote by $a_1 \ge \ldots \ge a_n$ the eigenvalues of $A_n$, $b_1 \ge \ldots \ge b_n$ the eigenvalues of $B_n$, $x_1 > \ldots > x_n$ the eigenvalues of $X_n$, and $y_1 > \ldots > y_n$ the eigenvalues of $Y_n$ (note that we make a small abuse of notation for the sake of simplicity, and omit in the notation the dependence in $n$). It follows from the above that for all $i$, $f_n(x_i) = a_i$ and $g_n(y_i) = b_i$.

Next, let us introduce the decomposition $A_n = U_a\,\mathrm{diag}(a_1, \ldots, a_n)\,U_a^*$ and similarly for $B_n, X_n, Y_n$. It is known that it is possible to make a choice for $U_a$ (resp. $U_b, U_x, U_y$) that depends on $A_n$ (resp. $B_n, X_n, Y_n$) in a measurable way.

Let $\tilde X_n = U_a\,\mathrm{diag}(x_1, \ldots, x_n)\,U_a^*$ and $\tilde Y_n = U_b\,\mathrm{diag}(y_1, \ldots, y_n)\,U_b^*$.

The matrices $\tilde X_n, \tilde Y_n$ are random matrices and they have the property that $f_n(\tilde X_n) = A_n$ and $g_n(\tilde Y_n) = B_n$. Besides, they are independent from each other. Finally, they both follow the GUE distribution because the latter is known to be determined by three criteria that are obviously satisfied in the construction of $\tilde X_n, \tilde Y_n$, namely: (a) the distribution of its eigenvalues is the correct one, (b) its eigenvalues and its eigenvectors are independent, and (c) its eigenvectors are distributed according to the invariant measure.

We conclude the proof by an application of Step II to the matrices $\tilde X_n, \tilde Y_n$ with the functions $f_n, g_n$.

□ 

5. Asymptotic behaviour of $K_{n,k,t}$

We now introduce the convex body $K_{k,t} \subset \Delta_k$ as follows:

(18) $K_{k,t} := \{\lambda \in \Delta_k \mid \forall a \in \Delta_k,\ \langle \lambda, a \rangle \le \|a\|_{(t)}\},$

where $\langle \cdot, \cdot \rangle$ denotes the canonical scalar product in $\mathbb{R}^k$. We shall show in Theorem 6.4 that this set is intimately related to the (t)-norm: $K_{k,t}$ is the intersection of the dual ball of the (t)-norm with the probability simplex $\Delta_k$. Since it is defined by duality, $K_{k,t}$ is the intersection of the probability simplex with the half-spaces

$H^+(a,t) = \{x \in \mathbb{R}^k \mid \langle x, a \rangle \le \|a\|_{(t)}\}$

for all directions $a \in \Delta_k$. Moreover, we shall show in Theorem 5.3 that every hyperplane $H(a,t) = \{x \in \mathbb{R}^k \mid \langle x, a \rangle = \|a\|_{(t)}\}$ is a supporting hyperplane for $K_{k,t}$.

5.1. A set of probability one and statement of the results. Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space in which the sequence of random vector subspaces $(V_n)_{n \ge 1}$ is defined. Since we assume that the elements of this sequence are independent, we may assume that $\Omega = \prod_{n \ge 1} \mathrm{Gr}_N(\mathbb{C}^k \otimes \mathbb{C}^n)$ and $\mathbb{P} = \bigotimes_{n \ge 1} \mu_n$ where $\mu_n$ is the invariant measure on the Grassmann manifold $\mathrm{Gr}_N(\mathbb{C}^k \otimes \mathbb{C}^n)$. Let $P_n \in M_{nk}(\mathbb{C})$ be the random orthogonal projection whose image is $V_n$. For two positive sequences $(a_n)_n$ and $(b_n)_n$, we write $a_n \ll b_n$ iff $a_n/b_n \to 0$ as $n \to \infty$.

Proposition 5.1. Let $\nu_n$ be a sequence of integers satisfying $\nu_n \ll n$. Almost surely, the following holds true: for any self-adjoint matrix $A \in M_k(\mathbb{C})$, the $\nu_n$-th largest eigenvalue of $P_n(A \otimes I_n)P_n$ converges to $\|a\|_{(t)}$ where $a$ is the eigenvalue vector of $A$. This convergence is uniform on any compact set of $M_k(\mathbb{C})_{sa}$.

Proof. For any self-adjoint $A \in M_k(\mathbb{C})$, the almost sure convergence follows from Theorem 4.2 and from Theorem 3.1.

Let $A_l$ be a countable family of self-adjoint matrices in $M_k(\mathbb{C})$ and assume that their union is dense in the operator norm unit ball. By sigma-additivity, the property to be proved holds almost surely simultaneously for all $A_l$'s.

This implies that the property holds for all $A$ almost surely, as the $j$-th largest eigenvalue of a random matrix is a Lipschitz function for the operator norm on the space of matrices.

□ 
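The density argument in this proof can be made explicit as follows. The sketch assumes Weyl's inequality for the ordered eigenvalues of self-adjoint matrices; the notation $\lambda_j(\cdot)$ for the $j$-th largest eigenvalue and the second matrix $B$ are introduced only for this illustration.

```latex
% Weyl's inequality (assumed): for self-adjoint M, M' of the same size,
% |\lambda_j(M) - \lambda_j(M')| \le \|M - M'\| for every j.
\[
\begin{aligned}
\big| \lambda_{\nu_n}\big(P_n (A \otimes I_n) P_n\big) - \lambda_{\nu_n}\big(P_n (B \otimes I_n) P_n\big) \big|
 &\le \big\| P_n \big((A - B) \otimes I_n\big) P_n \big\| \\
 &\le \| A - B \| .
\end{aligned}
\]
% Hence the maps A \mapsto \lambda_{\nu_n}(P_n (A \otimes I_n) P_n) are 1-Lipschitz uniformly in n,
% so convergence on a dense countable family extends to every A, uniformly on compact sets.
```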
The set on which the conclusion of the above proposition holds true will be denoted by $\Omega'$ and we therefore have $\mathbb{P}(\Omega') = 1$. Technically, $\Omega'$ depends on $\nu_n$ but in the proofs, we won't need to keep track of this dependence as $\nu_n$ will be a fixed sequence.

The main result of our paper is the following characterization of the asymptotic behavior of the random set $K_{n,k,t}$. We show that this set converges, in a very strong sense, to the convex body $K_{k,t}$.

Theorem 5.2. Almost surely, the following holds true:

• Let $O$ be an open set in $\Delta_k$ containing $K_{k,t}$. Then, for $n$ large enough, $K_{n,k,t} \subset O$.

• Let $\mathcal{K}$ be a compact set in the interior of $K_{k,t}$. Then, for $n$ large enough, $\mathcal{K} \subset K_{n,k,t}$.

The proof of this theorem goes according to the following non-standard scheme: the first inclusion follows a strategy developed in [19] and improves on it. This is the object of Theorem 5.4. Revisiting the strategy of proof of Theorem 5.4 gives rise to a result about eigenvectors of random matrices, as stated in Theorem 5.3 below, and in turn, Theorem 5.3 is needed to prove the second part of Theorem 5.2. This is the purpose of Theorem
EH 

Note that all the statements above are of almost sure nature. At first sight this looks unnatural because there is no assumption on the probability space on which the family of random matrices indexed by the dimension is defined. The only assumptions are on the $n$-dimensional marginals. The fact that the results hold with probability one on any probability space having the appropriate marginals follows from arguments of Borel-Cantelli type.

Instead of stating a result of convergence almost surely, it is also possible, in the spirit of e.g. [2, Theorem 2.1.1], to write down a theorem of convergence in probability. The benefit of doing so is that one does not need to bother to realize all random matrices in the same probability space. Such a result actually follows from the above Theorem. We could have chosen such an approach, but we felt that the technical details of the proof would have been more involved (in our proof we intersect countably many probability one measurable subsets of an appropriate probability space). Note also that Anderson, Guionnet and Zeitouni also state results of almost sure convergence (see for example [2, Exercise 2.1.16]). Similarly, in the original results by Haagerup and Thorbjørnsen, the convergence results are of almost sure nature.

A byproduct of the first part of the above theorem, and a necessary step towards its 
second part is the following result, of independent interest in random matrix theory: 

Theorem 5.3. Consider a matrix $A = \mathrm{diag}(a)$ whose eigenvalue vector is $a \in \mathbb{R}^k$ and let
$\nu_n$ be a sequence of integers satisfying $\nu_n \ll n$. We assume that all eigenvalues of $A$ are
simple.

Let $x^{(n)}$ be the unit-norm eigenvector corresponding to the $\nu_n$-th largest eigenvalue of
$P_n (A \otimes I_n) P_n$, which admits a singular value decomposition

$$x^{(n)} = \sum_{i=1}^{k} \sqrt{\lambda_i^{(n)}}\; e_i^{(n)} \otimes f_i^{(n)}.$$

Then, almost surely, for each $i = 1, 2, \ldots, k$, $e_i^{(n)}$ converges to the eigenvector corresponding
to the $i$-th largest eigenvalue of $A$ (modulo a phase change). Moreover, if $\lambda$ is the
exposed point of $K_{k,t}$ such that the supporting hyperplane is defined by the direction $a$,
then, almost surely

$$\lim_{n \to \infty} \lambda^{(n)} = \lambda.$$

This theorem has its own interest from the random matrix point of view. Indeed it
can be seen as a law of large numbers for the $U(k)/U(1)^k$ and the $\mathbb{R}^k$ components of the
singular value decomposition of the eigenvectors. Even though many laws of large numbers
have been obtained for eigenvalues, not much is known about the structure of eigenvectors
(except [27], [8] and references therein).

5.2. Upper bound. The first part of Theorem 5.2 is the following result:

Theorem 5.4. Let $O$ be an open set in $\Delta_k$ containing $K_{k,t}$. Then almost surely, for $n$
large enough, $K_{n,k,t} \subset O$.

This result provides almost surely an upper bound for the set $K_{n,k,t}$. The proof of this
theorem relies on Theorem 4.2 and on two lemmas, which are adapted from [19] and which
we state and prove below.

Lemma 5.5. Let $Q \in \mathcal{M}_n(\mathbb{C})$ be a self-adjoint projection and $R \in \mathcal{M}_n(\mathbb{C})$ be a self-adjoint
element. Then

(19)    $\|QRQ\|_\infty = \max_{x} \mathrm{Tr}(P_x R),$

where $P_x$ denotes the orthogonal projection on the one-dimensional space $\mathbb{C}x$.

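For two matrices $A, B \in \mathcal{M}_k(\mathbb{C})$, we write $A \sim B$ if there exists a unitary operator
$U \in U(k)$ such that $A = UBU^*$. For a vector $x \in \mathbb{C}^k \otimes \mathbb{C}^n$ with Schmidt coefficients
$\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_k \geq 0$, and an element $a \in \Delta_k^\downarrow$, we introduce the notation

$$s_a(x) = a_1 \lambda_1 + \cdots + a_k \lambda_k = \langle a, \lambda \rangle.$$

Similarly, for a matrix $A \in \mathcal{M}_k(\mathbb{C})$, we introduce the notation

$$s_A(x) := \mathrm{Tr}(P_x \cdot A \otimes I_n) = \mathrm{Tr}(\mathrm{Tr}_n(P_x) \cdot A),$$

where $\mathrm{Tr}_n = \mathrm{id}_k \otimes \mathrm{Tr}$ is the non-normalized conditional expectation $\mathcal{M}_{nk}(\mathbb{C}) \to \mathcal{M}_k(\mathbb{C})$.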

Lemma 5.6. Let $A$ be a self-adjoint matrix with ordered eigenvalue vector $a \in \Delta_k^\downarrow$. For
each $x \in \mathbb{C}^k \otimes \mathbb{C}^n$, the following holds true:

$$s_a(x) = \max_{A' \sim A} s_{A'}(x).$$

Proof. For two matrices $A, B \in \mathcal{M}_k(\mathbb{C})$ with respective eigenvalues $\mu_1 \geq \cdots \geq \mu_k \geq 0$ and
$\lambda_1 \geq \cdots \geq \lambda_k \geq 0$, it follows from the min-max theorem that

$$\sum_{i} \lambda_i \mu_i = \max_{A' \sim A} \mathrm{Tr}(A' B).$$

Letting $B = \mathrm{Tr}_n P_x$, the above observation implies:

(20)    $s_a(x) = \max_{U \in U(k)} \mathrm{Tr}(U A U^* \, \mathrm{Tr}_n P_x) = \max_{A' \sim A} \mathrm{Tr}(A' \, \mathrm{Tr}_n P_x).$

The conditional expectation property of the partial trace implies that

(21)    $s_a(x) = \max_{A' \sim A} \mathrm{Tr}(P_x \cdot A' \otimes I_n) = \max_{A' \sim A} s_{A'}(x).$

□

Since $k$ is a fixed parameter of our model, in order to compute the maximum in Lemma
5.6 over the unitary orbit indexed by $U(k)$, we can pick a finite but large enough number
of elements of the corresponding orbit to obtain a good approximation of the maximum:

Lemma 5.7. For a fixed self-adjoint matrix $A \in \mathcal{M}_k(\mathbb{C})$ with eigenvalue vector $a \in \mathbb{R}^k$ and
for all $\varepsilon > 0$, there exist a finite number of matrices $B_1, \ldots, B_l$, self-adjoint and conjugated
to $A$, such that, for all $x \in \mathbb{C}^{nk}$,

(22)    $\max_{1 \le i \le l} \mathrm{Tr}(P_x \cdot B_i \otimes I_n) \leq s_a(x) \leq \max_{1 \le i \le l} \mathrm{Tr}(P_x \cdot B_i \otimes I_n) + \varepsilon.$

Proof. We only need to prove the second inequality, the first one being a direct consequence
of Lemma 5.6. Since the orbit under unitary conjugation of a self-adjoint matrix $A$ is
compact for the metric $d(B, B') = \|B - B'\|_\infty$, for all $\varepsilon > 0$ there exists a covering of the
orbit by a finite number of balls of radius $\varepsilon$ centered at $B_1, B_2, \ldots, B_l$. Fix some $x \in \mathbb{C}^{nk}$
and consider the element $B$ in the orbit of $A$ for which the maximum in the definition of
$s_a(x)$ is attained. The matrix $B$ is inside some ball centered at $B_i$ and we have

(23)    $\mathrm{Tr}(P_x \cdot B \otimes I_n) \leq \mathrm{Tr}(P_x \cdot B_i \otimes I_n) + |\mathrm{Tr}[P_x \cdot (B_i - B) \otimes I_n]|$
        $\leq \mathrm{Tr}(P_x \cdot B_i \otimes I_n) + \|B_i - B\|_\infty \leq \mathrm{Tr}(P_x \cdot B_i \otimes I_n) + \varepsilon$

and the conclusion follows. □
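The covering argument can be tried out in a toy case. The following is only a sketch under simplifying assumptions that are not in the text: $k = 2$, the orbit of $\mathrm{diag}(a)$ is restricted to real rotations $R(\theta)$, and the reduced matrix $\mathrm{Tr}_n P_x$ is replaced by a real symmetric matrix with spectrum `lam` and eigenbasis angle `phi`. The helper names (`orbit_value`, `net_size`, `check_net`) are hypothetical.

```python
import math
import random

def orbit_value(a, lam, theta, phi):
    """Tr(B_theta rho_phi) for real 2x2 matrices
    B_theta = R(theta) diag(a) R(theta)^T and rho_phi = R(phi) diag(lam) R(phi)^T.
    Only the relative angle theta - phi matters."""
    c2 = math.cos(theta - phi) ** 2
    aligned = a[0] * lam[0] + a[1] * lam[1]   # eigenbases aligned
    crossed = a[0] * lam[1] + a[1] * lam[0]   # eigenbases swapped
    return aligned * c2 + crossed * (1.0 - c2)

def net_size(a, eps):
    """Smallest m such that the angles i*pi/m give an eps-net of this orbit in
    operator norm, using ||B_theta - B_theta'|| = |a1 - a2| * |sin(theta - theta')|."""
    m = 1
    while abs(a[0] - a[1]) * math.sin(math.pi / (2 * m)) > eps:
        m += 1
    return m

def check_net(a, eps, trials=1000, seed=0):
    """Worst observed gap s_a - max_i Tr(B_i rho) over random test matrices rho."""
    rng = random.Random(seed)
    m = net_size(a, eps)
    worst = 0.0
    for _ in range(trials):
        p = rng.uniform(0.5, 1.0)
        lam = (p, 1.0 - p)                  # ordered spectrum of rho
        phi = rng.uniform(0.0, math.pi)     # eigenbasis angle of rho
        s_a = a[0] * lam[0] + a[1] * lam[1]
        net_max = max(orbit_value(a, lam, i * math.pi / m, phi) for i in range(m))
        worst = max(worst, s_a - net_max)
    return m, worst

m, worst = check_net((0.7, 0.3), eps=0.01)
print(m, worst)
```

In this toy setting the finite net never overshoots $s_a$ and loses at most $\varepsilon$, which is the content of (22).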
Now we are ready to prove Theorem 5.4.



Proof of Theorem 5.4. For a given open neighborhood $O$ of $K_{k,t}$, one can find a small
positive constant $\varepsilon$ and a finite number of ordered probability vectors $a_1, a_2, \ldots, a_L \in \Delta_k^\downarrow$
such that

(24)    $K_{k,t} \subset \bigcap_{i=1}^{L} \{ z \in \Delta_k \mid \langle z^\downarrow, a_i \rangle \leq \|a_i\|_{(t)} \} \subset \bigcap_{i=1}^{L} \{ z \in \Delta_k \mid \langle z^\downarrow, a_i \rangle \leq \|a_i\|_{(t)} + \varepsilon \} \subset O.$

Note that only the last inclusion is non-trivial in the above equation. Consider a positive
self-adjoint matrix $A \in \mathcal{M}_k(\mathbb{C})$ with eigenvalue vector $a \in \Delta_k^\downarrow$ and a random vector
space $V_n$ of dimension $N \sim tnk$. According to Theorem 4.2, almost surely, we have that

(25)    $\lim_{n \to \infty} \| P_{V_n} \cdot (A \otimes I_n) \cdot P_{V_n} \|_\infty = \|a\|_{(t)}.$

By Lemma 5.6, for every such subspace $V$, one also has that

(26)    $\max_{x \in V,\ \|x\| = 1} s_a(x) = \max_{x \in V,\ \|x\| = 1}\ \max_{B \sim A} \mathrm{Tr}(P_x \cdot B \otimes I_n).$

Using the compactness argument in Lemma 5.7, one can consider (at a cost of $\varepsilon$) only
a finite number of matrices $B$:

(27)    $\max_{x \in V,\ \|x\| = 1} s_a(x) \leq \max_{1 \le i \le l}\ \max_{x \in V,\ \|x\| = 1} \mathrm{Tr}(P_x \cdot (B_i \otimes I_n)) + \varepsilon = \max_{1 \le i \le l} \| P_{V_n} \, B_i \otimes I_n \, P_{V_n} \|_\infty + \varepsilon.$

After applying Theorem 4.2 to each of the pairs $(B_j, P_{V_n})$, $1 \leq j \leq l$, one has that,
almost surely,

(28)    $\limsup_{n \to \infty}\ \max_{x \in V_n,\ \|x\| = 1} s_a(x) \leq \|a\|_{(t)} + \varepsilon.$

Using $L$ times the previous line of reasoning, by letting $a = a_i$ for $i = 1, \ldots, L$, we
obtain that, almost surely, for $n$ large enough,

(29)    $K_{n,k,t} \subset \bigcap_{i=1}^{L} \{ z \in \Delta_k \mid \langle z^\downarrow, a_i \rangle \leq \|a_i\|_{(t)} + \varepsilon \} \subset O.$

□

5.3. Lower bound. We start with the proof of Theorem 5.3, needed for the second part
of our main result, Theorem 5.2.

Proof of Theorem 5.3. Since the set $\Omega'$ introduced after Proposition 5.1 has probability
one, we may pick a sequence $(V_n)_{n \in \mathbb{N}}$ in $\Omega'$.






Let us consider the eigenvector $x^{(n)}$ of the $\nu_n$-th largest eigenvalue of $P_n (A \otimes I_n) P_n$ and
write its singular value (or Schmidt) decomposition:

$$x^{(n)} = \sum_{j=1}^{k} \sqrt{\lambda_j^{(n)}}\; e_j^{(n)} \otimes f_j^{(n)}.$$

To start, notice that since the range of the matrix $P_n (A \otimes I_n) P_n$ is a subspace of $V_n$,
one must have $x^{(n)} \in V_n$. It has been shown in the proof of Theorem 5.4 that for any open
set $O$ containing $K_{k,t}$, the probability vector $\lambda^{(n)}$ is in $O$, for $n$ large enough.

Using the fact that $x^{(n)}$ is the eigenvector corresponding to $\mu_n$, the $\nu_n$-th largest
eigenvalue of $P_n (A \otimes I_n) P_n$, we obtain that

$$P_n (A \otimes I_n) P_n P_{x^{(n)}} = \mu_n P_{x^{(n)}}.$$

Recall that (Proposition 5.1) $\mu_n \geq \|a\|_{(t)} - \varepsilon$ for $n$ large enough, thus

$$\mathrm{Tr}(P_n (A \otimes I_n) P_n P_{x^{(n)}}) \geq \|a\|_{(t)} \, \mathrm{Tr}\, P_{x^{(n)}} - \varepsilon,$$

where $a$ is the eigenvalue vector of $A$. Since $x^{(n)} \in V_n = \mathrm{Im}\, P_n$, it follows that $P_n P_{x^{(n)}} = P_{x^{(n)}}$.
In addition, using the fact that $\mathrm{Tr}\, P_{x^{(n)}} = 1$, one obtains the following lower bound:

$$s_A(x^{(n)}) \geq \|a\|_{(t)} - \varepsilon.$$

This implies that for $n$ large enough,

$$\lambda^{(n)} \in O \cap \{ z \mid \langle z^\downarrow, a \rangle \geq \|a\|_{(t)} - \varepsilon \}.$$

Hence, the hyperplane $H_a = \{ z \mid \langle z^\downarrow, a \rangle = \|a\|_{(t)} \}$ is a supporting hyperplane for the
convex set $K_{k,t} \subset \Delta_k$.

If $z$ is an exposed point of $K_{k,t}$, defined by a hyperplane $H_a$ which intersects $K_{k,t}$ only
at $z$, then $\lambda^{(n)}$ converges to the exposed point $z$, showing the first part of the result.

Next, we study the convergence of the Schmidt vectors $e_j^{(n)} \in \mathbb{C}^k$. Let $B \sim A$ be a
self-adjoint matrix in $\mathcal{M}_k(\mathbb{C})$ with the same eigenvalues as $A$. It follows from the proof of
Theorem 5.4 that $s_B(x^{(n)}) \leq \|a\|_{(t)} + \varepsilon$ for large enough $n$.

Hence, the function

$$B \mapsto s_B(x^{(n)}) = \mathrm{Tr}(B \cdot \mathrm{Tr}_n P_{x^{(n)}})$$

is $2\varepsilon$-close to its maximum at $B = A$. Using the general fact that the real function

$$U(k) \ni U \mapsto \mathrm{Tr}(A U B U^*)$$

is continuous and has only one maximum, achieved when the eigenvectors of UAU* are 
parallel to the eigenvectors of B (and respecting the order of the eigenvalues), we can 
conclude the proof of the lower bound. □ 

The next result is an improvement over Theorem 5.3 and shows that we do not need to
restrict ourselves to a single eigenvector $x^{(n)}$ but that we can choose $x$ in a vector space
of arbitrary size (prescribed in advance) such that the conclusions of the above theorem
still hold for $x$. This fact will be useful in the final step of the proof of Theorem 5.10,
as it allows one to perform a Gram-Schmidt orthogonalization procedure.

Proposition 5.8. Let $\lambda$ be an exposed point of $K_{k,t}$ and let $a$ be a direction of the
supporting hyperplane tangent at $\lambda$. Then, for any $\varepsilon > 0$ and any integer $l$, almost surely as
$n \to \infty$, there exists a linear subspace $V'_n$ of $V_n$ of dimension $l$ such that for any norm 1
vector $x$ of $V'_n$, the singular values of $x$ are $\varepsilon$-close to $\lambda$ and the vectors $e_i$ appearing in
the singular value decomposition of $x$ are $\varepsilon$-close to the vectors of a fixed orthonormal
basis of $\mathbb{C}^k$.

Proof. We prove this result by induction over $l$. For $l = 1$, this is Theorem 5.3. In the
remainder of the proof, our standing assumption is that almost surely as $n \to \infty$, there
exists a linear subspace $V'_n$ of $V_n$ of dimension $l$, spanned by $l$ eigenvectors of $P_n (A \otimes I_n) P_n$,
such that for any norm 1 vector $x$ of $V'_n$, the singular values of $x$ are $\varepsilon$-close to $\lambda$ and the
vectors $e_i$ appearing in the singular value decomposition of $x$ are $\varepsilon$-close to the vectors of
a fixed orthonormal basis of $\mathbb{C}^k$. Since the singular value decomposition of vectors is
continuous in all of its parameters, we can assume that the subspace $V'_n$ is spanned by $l$
eigenvectors $y_1, \ldots, y_l$ which satisfy

$$\forall\, 1 \leq j \leq l, \qquad y_j = \sum_{i=1}^{k} \sqrt{\lambda_i}\; e_i \otimes f_i^{(j)} + y'_j,$$

where $e_i$ is the aforementioned fixed basis of $\mathbb{C}^k$ and $y'_j$ is a correction of small norm:

$$\|y'_j\| \leq \varepsilon.$$

Our task is to find an additional vector $y_{l+1} \in V_n \setminus V'_n$ such that the vector space
$V''_n = \mathrm{span}\{y_{l+1}, V'_n\}$ satisfies almost surely, as $n \to \infty$, for any norm 1 vector $x$ of $V''_n$,
the singular values of $x$ are $\varepsilon$-close to $\lambda$ and the $\mathbb{C}^k$ part of its singular vectors are close to
the $e_i$. As stated before, we shall choose $y_{l+1}$ to be an eigenvector of $P_n (A \otimes I_n) P_n$. This
choice being made for $y_1, \ldots, y_l$, it ensures the orthogonality relation $y_{l+1} \perp V'_n$. In view of
Theorem 5.3, for this strategy to work, we need to choose $y_{l+1}$ an eigenvector corresponding
to a large eigenvalue; this ensures that $y_{l+1}$ itself satisfies the singular value and singular
vector requirements. We now need to show that every vector of $V''_n = \mathrm{span}\{y_{l+1}, V'_n\} = \mathrm{span}\{y_1, \ldots, y_{l+1}\}$
satisfies the same requirements.

In order to conclude, we need to choose an eigenvector $y_{l+1}$ which is orthogonal to all
the vectors in the set

$$Y = \{ e_{i_1} \otimes f_{i_2}^{(j)} \mid 1 \leq i_1, i_2 \leq k,\ 1 \leq j \leq l \}.$$

This can be done, since we may choose $y_{l+1}$ from a list of $\nu_n$ eigenvectors of $P_n (A \otimes I_n) P_n$
(corresponding to the $\nu_n$ largest eigenvalues).

Indeed, start from the simple observation that the $\nu_n$ eigenvectors associated with the $\nu_n$
largest eigenvalues of $P_n (A \otimes I_n) P_n$ (call them $x_1, \ldots, x_{\nu_n}$) are orthogonal, and therefore
satisfy the following Parseval inequality:

$$\sum_{i=1}^{\nu_n} |\langle x_i, y \rangle|^2 \leq 1$$

for any vector $\|y\| \leq 1$. Therefore it follows that there are at least $\nu_n - \varepsilon^{-1}$ of them that
satisfy

$$|\langle x_i, y \rangle|^2 \leq \varepsilon.$$

Similarly, let now $Y$ be a finite collection of norm 1 vectors. The union bound tells that
there are at least $\nu_n - |Y| \varepsilon^{-1}$ of them that satisfy

$$|\langle x_i, y \rangle|^2 \leq \varepsilon$$

for any $y \in Y$. As soon as $\nu_n > kl/\varepsilon$, we are guaranteed the existence of an eigenvector $y_{l+1}$
which is almost orthogonal to all the terms appearing in the singular value decomposition
of each of $y_1, \ldots, y_l$. This implies that, for all $1 \leq i_1, i_2 \leq k$ and $1 \leq j \leq l$,

$$|\langle y_{l+1},\, e_{i_1} \otimes f_{i_2}^{(j)} \rangle|^2 \leq \varepsilon.$$

Let us now consider an arbitrary norm one vector in $V''_n = \mathrm{span}\{y_1, \ldots, y_{l+1}\}$ and
compute its (approximate) singular value decomposition. Let $(\alpha_1, \ldots, \alpha_{l+1})$ be a unit
norm vector in $\mathbb{C}^{l+1}$. Then

$$\sum_{j=1}^{l+1} \alpha_j y_j \approx \sum_{i=1}^{k} \sqrt{\lambda_i}\; e_i \otimes \Big( \sum_{j=1}^{l+1} \alpha_j f_i^{(j)} \Big).$$

Since the vectors $\sum_{j=1}^{l+1} \alpha_j f_i^{(j)}$ form an orthogonal family for $1 \leq i \leq k$, it follows that
the conclusion of Proposition 5.8 holds at dimension $l + 1$ and with an appropriately
updated value of the error term $\varepsilon$.

□ 
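The counting step in the proof above (Parseval inequality plus union bound) can be illustrated numerically. This is a minimal sketch under assumptions of ours: the orthonormal family is taken to be the first `nu` standard basis vectors of a real space, so that $\langle x_i, y \rangle$ is just the $i$-th coordinate of $y$, and `Y` is a small random collection of unit vectors. Function names are hypothetical.

```python
import math
import random

def unit_vector(dim, rng):
    """A random unit vector in R^dim."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]

def almost_orthogonal_indices(nu, collection, eps):
    """Indices i < nu such that |<e_i, y>|^2 <= eps for every y in the collection,
    where e_i is the i-th standard basis vector (so <e_i, y> = y[i])."""
    return [i for i in range(nu) if all(y[i] ** 2 <= eps for y in collection)]

rng = random.Random(1)
dim, nu, eps = 200, 100, 0.05
Y = [unit_vector(dim, rng) for _ in range(3)]
good = almost_orthogonal_indices(nu, Y, eps)
print(len(good), "of", nu, "candidates are eps-almost orthogonal to all of Y")
```

Each unit vector can have at most $\varepsilon^{-1}$ coordinates of squared size above $\varepsilon$, so at least `nu - len(Y) / eps` indices survive, whatever the collection `Y` is.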

We also need the following elementary lemma: 

Lemma 5.9. Let $F : \mathbb{R}^p \times \mathbb{R}^q \to \mathbb{R}^p$ be a continuous map such that $F(\cdot, 0) = \mathrm{id}_p$. Let $K$
be a subset of $\mathbb{R}^p$ and $K'$ be a compact subset of the interior of $K$. Then, there exists a
neighborhood of $0$ in $\mathbb{R}^q$ such that for any $y$ in this neighborhood, $K' \subset F(K, y)$.

Proof. Since $K'$ is compact, without loss of generality we may assume that $K$ is bounded.
The continuity assumption on $F$ and the boundedness of $K$ imply that the map $y \mapsto F(K, y)$
is continuous with respect to the Hausdorff distance. The result then follows
readily from this observation. □
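For readers who want to experiment with the Hausdorff continuity used in this proof, here is a small sketch. It assumes finite point sets in place of general compact sets and takes the simplest map satisfying $F(\cdot, 0) = \mathrm{id}$, namely the translation $F(x, y) = x + y$; both choices are ours and are not taken from the text.

```python
import math

def hausdorff(P, Q):
    """Hausdorff distance between two finite point sets in Euclidean space."""
    def directed(S, T):
        return max(min(math.dist(s, t) for t in T) for s in S)
    return max(directed(P, Q), directed(Q, P))

def translate(K, y):
    """Image F(K, y) of the set K under the translation F(x, y) = x + y."""
    return [tuple(c + d for c, d in zip(p, y)) for p in K]

K = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (0.5, 0.5)]
for scale in (0.4, 0.2, 0.1, 0.05):
    y = (scale, -scale)
    # The distance between K and F(K, y) is bounded by |y| and shrinks with it.
    print(scale, hausdorff(K, translate(K, y)))
```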

Finally we state a result that will complete the proof of Theorem 5.2.

Theorem 5.10. For any compact set $\mathcal{K}$ contained in the interior of $K_{k,t}$, almost surely
for $n$ large enough, $\mathcal{K} \subset K_{n,k,t}$.

Proof. We shall prove a slightly stronger version of this result. Let $\mathcal{V}_n$ be the subset
of rank one selfadjoint projections of $\mathrm{End}(V_n)$. The inclusion $V_n \subset \mathbb{C}^k \otimes \mathbb{C}^n$ induces a
non-unital inclusion of matrix algebras $\mathrm{End}(V_n) \subset \mathcal{M}_k(\mathbb{C}) \otimes \mathcal{M}_n(\mathbb{C})$. Let $\tilde{K}_{k,t}$ be the
collection of self-adjoint matrices in $\mathcal{M}_k(\mathbb{C})$ whose eigenvalues belong to $K_{k,t}$. This is
clearly a compact subset of $\mathcal{M}_k(\mathbb{C})$, and it is of non-empty interior in the affine variety
of trace one self-adjoint matrices. (Indeed, for any $a$ which is not a multiple of
the identity, $\langle 1_k, a \rangle / k < \|a\|_{(t)}$, while when $a$ is a multiple of the identity, the inequality
$\langle \lambda, a \rangle \leq \|a\|_{(t)}$ is trivially satisfied for all $\lambda \in \Delta_k$, so $K_{k,t}$ contains a neighborhood of $1_k / k$.)
If we can prove that for any compact subset $\mathcal{K}$ of the interior of $\tilde{K}_{k,t}$, with probability
one, for $n$ large enough, $\mathcal{K} \subset (\mathrm{id}_k \otimes \mathrm{Tr}_n)(\mathcal{V}_n)$, then the theorem will be proved. One may
think of this new problem as a quantum version of the original problem.

So, let us concentrate on proving this fact. In order to simplify notation, let us denote
$(\mathrm{id}_k \otimes \mathrm{Tr}_n)(\mathcal{V}_n)$ by $\tilde{K}_{n,k,t}$. Since from any covering of a compact set by open sets one can
extract a finite sub-covering, it is enough to prove that for any closed ball of center $x$ and
radius $\varepsilon$ in the interior of $\tilde{K}_{k,t}$, almost surely for $n$ large enough, $B(x, \varepsilon)$ is contained in
the interior of $\tilde{K}_{n,k,t}$.

Given the closed ball $B(x, \varepsilon)$, let $A_1, \ldots, A_m$ be exposed points of $\tilde{K}_{k,t}$ whose convex
hull contains a neighborhood of $B(x, \varepsilon)$. Such $A_1, \ldots, A_m$ always exist because the set
of exposed points is dense in the set of extremal points, by a result of Straszewicz ([30,
Theorem 18.6]).

Let $y_i \in V_n$ be a norm one vector such that $A_i$ is the orthogonal rank one projection
onto $\mathbb{C} y_i$. For each $i \in \{1, \ldots, m\}$, let $V'_i$ be a vector subspace of dimension $l$ (to be
specified later) as in Proposition 5.8. Let $x_1 \in V'_1$ be any norm 1 vector and let $f_i^{(1)}$ be
the vectors in $\mathbb{C}^n$ appearing in its singular value decomposition. Using Proposition 5.8
and an appropriate Gram-Schmidt procedure, since the dimension $l$ is large
enough, we can find $x_2 \in V'_2$ such that the vectors $f_i^{(2)} \in \mathbb{C}^n$ appearing in its Schmidt
decomposition are all orthogonal to all $f_{i'}^{(1)}$, $i' \in \{1, \ldots, k\}$.

By induction, we can find $x_j \in V'_j$ such that the vectors $f_i^{(j)} \in \mathbb{C}^n$ appearing in its
Schmidt decomposition are all orthogonal to all $f_{i'}^{(j')}$, for all $i' \in \{1, \ldots, k\}$ and $j' < j$.

For $n$ large enough, it follows from Lemma 5.9 (and from the fact that the use of
Proposition 5.8 ensures an appropriate convergence of the $e_i \in \mathbb{C}^k$ part of the Schmidt
decomposition), that the collection of Schmidt vectors of a linear combination

$$\Big\{ \alpha_1 x_1 + \cdots + \alpha_m x_m,\ \sum_{i} |\alpha_i|^2 = 1 \Big\}$$

contains $B(x, \varepsilon)$. □

Corollary 5.11. In the metric space of compact subsets of $\Delta_k$ endowed with the Hausdorff
distance, the distribution of $\partial K_{n,k,t}$ converges in probability to the Dirac mass on $\partial K_{k,t}$.

Proof. It is enough to prove that the result holds almost surely. It follows from Theorem
5.2 that for any $\varepsilon > 0$, with probability one, for $n$ large enough, $\partial K_{n,k,t}$ is included in an
$\varepsilon$-neighborhood of $\partial K_{k,t}$.

Let us prove the converse inclusion. Let $x$ be an element in the interior of $K_{k,t}$ and $y$ be
an element in $\partial K_{k,t}$. Our results so far imply that, for $n$ large enough, $x$ is an element of
$K_{n,k,t}$. Let $t_n \in \mathbb{R}_+$ be the maximal number such that $x + t_n (y - x) \in K_{n,k,t}$. By the upper
bound in Theorem 5.2, we have $\limsup_n t_n \leq 1$. The strict inequality $\liminf_n t_n < 1$ would
yield a contradiction with the lower bound in the same theorem, therefore $\lim_n t_n = 1$. This
implies that $y$ is in an $\varepsilon$-neighborhood of $\partial K_{n,k,t}$.

Since this result holds true for all boundary points $y \in \partial K_{k,t}$, the proof is complete. □

6. Properties of the limiting set $K_{k,t}$ and of its dual

In this final section we derive geometric and convexity-related properties of the set $K_{k,t}$.
Since this limiting set is described via the duality equation (15), we start by investigating
the unit ball of the (t)-norm. The reader might find it helpful to think of $K_{k,t}$ as the
intersection of the dual of a "ball" formed by gluing two cones along their bases (a cylinder)
with the probability simplex $\Delta_k$. The two vertices correspond to the upper and lower discs
of the cylinder, the points on the circle along which the cones are glued correspond to
vertical segments on the vertical wall of the cylinder, while the points of the two "circles"
bordering the upper and lower discs of the cylinder are the images of segments starting
from the two vertices of the cones.

6.1. Preliminary observations. Using the permutation invariance of the $\|\cdot\|_{(t)}$ norm,
it is clear that $K_{k,t}$ is invariant under permutation of coordinates. We start with the
following lemma:

Lemma 6.1. Let $C$ be the interior of the Weyl chamber $\Delta_k^\downarrow$ of the probability simplex. Let
$\lambda \in C$ be an exposed point of $K_{k,t}$ and $a$ a direction such that $H(a, t) \cap K_{k,t} = \{\lambda\}$.
Then $a \in C$.

Proof. First, let us show that $a \in \Delta_k^\downarrow = \overline{C}$. If this were not the case, then there would exist
a direction $a' \in \Delta_k^\downarrow$, obtained by permuting the coordinates of $a$, such that

$$\langle \lambda, a' \rangle > \langle \lambda, a \rangle.$$

From this, we deduce that $\langle \lambda, a' \rangle > \langle \lambda, a \rangle = \|a\|_{(t)} = \|a'\|_{(t)}$, hence $\lambda \notin H^+(a', t)$, which
contradicts the fact that $\lambda \in K_{k,t}$.

Next, let us show that $a$ is not degenerate, i.e. it has distinct coordinates. Should $a$ have
two equal coordinates, say the $i$-th and the $j$-th, let $\lambda' \in K_{k,t}$ be the vector obtained by
permuting the $i$-th and the $j$-th coordinates in $\lambda$. As before, it follows that $\langle \lambda, a \rangle = \langle \lambda', a \rangle$
and thus $\{\lambda, \lambda'\} \subset H(a, t) \cap K_{k,t}$, which is a contradiction. □
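The first step of this proof rests on the rearrangement inequality: for a strictly decreasing $\lambda$, the pairing $\langle \lambda, a' \rangle$ over all permutations $a'$ of $a$ is maximized only by the decreasing rearrangement. A brute-force check on one illustrative pair of vectors (the numbers are ours, chosen only for the example) could look like this:

```python
from itertools import permutations

def pairing(lam, a):
    """The scalar product <lam, a>."""
    return sum(l * c for l, c in zip(lam, a))

lam = (0.5, 0.3, 0.2)   # strictly decreasing: a point in the interior of the Weyl chamber
a = (0.2, 0.5, 0.3)     # an unordered direction

# Enumerate all coordinate permutations of a and keep the one maximizing <lam, a'>.
best = max(permutations(a), key=lambda perm: pairing(lam, perm))
print(best, pairing(lam, best), pairing(lam, a))
```

The maximizer is the decreasing rearrangement of `a`, and its pairing with `lam` is strictly larger than that of the unordered `a`, which is the inequality $\langle \lambda, a' \rangle > \langle \lambda, a \rangle$ used above.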
The following proposition shows that, in a certain sense, the (t)-norm interpolates between
the $\ell^1$ and $\ell^\infty$ norms when $t \in (0, 1]$ and $x \in \mathbb{R}_+^k$.

Proposition 6.2. For any $x \in \mathbb{R}^k$, $\|x\|_{(t=1)} = \|x\|_\infty$ and $\lim_{t \to 0^+} \|x\|_{(t)} = k^{-1} \left| \sum_{i=1}^{k} x_i \right|$.

Proof. The first statement is just a rephrasing of the definition of $\|x\|_{(t)}$ at $t = 1$. The
second is a rephrasing of the free law of large numbers: as we know from the superconvergence
result of Bercovici and Voiculescu [11], if $X_1, X_2, \ldots$ are free i.d. random variables,
centered at $a$ and with variance $\sigma^2$, then

$$\mu_{\frac{X_1 - a + X_2 - a + \cdots + X_N - a}{\sqrt{N}}} = \mu_{\frac{X_1 + \cdots + X_N}{\sqrt{N}} - a\sqrt{N}} \to \frac{\sqrt{4\sigma^2 - u^2}}{2\pi\sigma^2}\, 1_{(-2\sigma, 2\sigma)}(u)\, du$$

in the sense that the ends of the supports of $\mu_{\frac{X_1 - a + \cdots + X_N - a}{\sqrt{N}}}$ converge to $\pm 2\sigma$. Taking
$t = 1/N$, $N \to \infty$, contraction by $1/N$ of $X_1 + \cdots + X_N$ corresponds to taking

$$\mu_{\frac{X_1 - a + X_2 - a + \cdots + X_N - a}{N}} = \mu_{\frac{1}{\sqrt{N}} \left( \frac{X_1 + \cdots + X_N}{\sqrt{N}} - a\sqrt{N} \right)} = \mu_{\frac{X_1 + \cdots + X_N}{N}} \boxplus \delta_{-a}.$$

Then the measures $\mu_{\frac{X_1 + \cdots + X_N}{N}}$ converge to $\delta_a$ in the sense that the ends of the support converge
to $a$. We obtain our result by taking $X_1$ to be distributed according to $\mu_x$, in which case
$a = k^{-1} \sum_{i=1}^{k} x_i$. □



In [19], using similar ideas, it was shown that the set $K_{k,t}$ is included in the convex
polytope $L_{k,t}$ defined by the following sequence of linear inequalities:

(30)    $\langle x, v_j \rangle \leq \|v_j\|_{(t)}$  where $v_j = 1^j 0^{k-j}$ for $j = 1, 2, \ldots, k$.

This polytope was shown to be closely related to the majorization relation "$\prec$" [12].
Actually, in [19], it was shown that $L_{k,t} = \{ x \in \Delta_k \mid x \prec \beta^{(t)} \}$ where

(31)    [definition of $\beta^{(t)}$; the displayed formula is not recoverable from the source]

However, the inclusion $K_{k,t} \subset L_{k,t}$ is strict, since $K_{k,t}$ is defined by a larger set of
inequalities, and most of the inequalities are not redundant, as is shown in the next section.
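Since the polytope is described through majorization, a small checker for the relation $x \prec \beta$ may help the reader experiment. The sketch below uses the standard definition (ordered partial sums of $x$ dominated by those of $\beta$, with equal totals); the test vectors are illustrative only and are not the vector $\beta^{(t)}$ of the text.

```python
def majorized_by(x, beta, tol=1e-12):
    """Return True when x is majorized by beta: after sorting both vectors in
    decreasing order, every partial sum of x is at most the matching partial
    sum of beta, and the total sums agree."""
    xs, bs = sorted(x, reverse=True), sorted(beta, reverse=True)
    if len(xs) != len(bs) or abs(sum(xs) - sum(bs)) > tol:
        return False
    partial_x = partial_b = 0.0
    for xi, bi in zip(xs, bs):
        partial_x += xi
        partial_b += bi
        if partial_x > partial_b + tol:
            return False
    return True

beta = (0.6, 0.3, 0.1)                            # illustrative probability vector
print(majorized_by((0.4, 0.35, 0.25), beta))      # flatter than beta
print(majorized_by((0.7, 0.2, 0.1), beta))        # largest entry exceeds beta's
```

The partial-sum conditions are exactly inequalities of the form (30) applied to the decreasing rearrangement of $x$, with right-hand sides given by the partial sums of $\beta$.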

6.2. Study of the geometry of $K_{k,t}$ and of the unit ball of the (t)-norm. Next we
shall remind the reader of a few elementary convex analysis results. First, the correspondence
$\mathbb{R}^k \ni u \mapsto H_u = \{ x : \langle u, x \rangle = 1 \}$ is a bijection between vectors and hyperplanes in
$\mathbb{R}^k$. If $A$ is a compact convex set whose interior contains the origin of $\mathbb{R}^k$, we shall denote
by $A^*$ its polar dual (or, for short, dual), i.e. $A^* = \{ x \in \mathbb{R}^k : \langle x, a \rangle \leq 1 \text{ for all } a \in A \}$. An
exposed face of $A$ is a set $A \cap H_u$ for some hyperplane $H_u$ with the property that $\langle a, u \rangle \leq 1$
for all $a \in A$. For any given exposed face $B$ of $A$, we can define the polar face mapping of
$A$

$$\varphi(B) = \{ x \in A^* : \langle b, x \rangle = 1 \text{ for } b \in B \}.$$

Then [36, Theorem 2.8.6] $\varphi$ is an inclusion reversing bijection. Moreover, if $b_0$ belongs to
the relative interior of $B$, then $\varphi(B) = \{ x \in A^* : \langle b_0, x \rangle = 1 \}$ [36, Exercise 2.8.4]. We shall
study this correspondence in more detail for the case when $A$ is the unit ball of a norm
(eventually of $\|\cdot\|_{(t)}$).
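As a sanity check of the definition of the polar dual, one can take $A$ to be the cube $[-1, 1]^k$, the unit ball of the $\ell^\infty$ norm, whose polar is the unit ball of the $\ell^1$ norm. The sketch below is only an illustration for this standard pair and does not involve the (t)-norm; it tests membership in $A^*$ by checking $\langle x, a \rangle \leq 1$ on the vertices of the cube, which suffices because a linear functional attains its maximum over a polytope at a vertex.

```python
from itertools import product

def in_polar_of_cube(x, tol=1e-12):
    """True when <x, a> <= 1 for every vertex a of the cube [-1, 1]^k,
    i.e. when x lies in the polar dual of the cube."""
    k = len(x)
    return all(sum(s * c for s, c in zip(signs, x)) <= 1.0 + tol
               for signs in product((-1.0, 1.0), repeat=k))

def l1_norm(x):
    return sum(abs(c) for c in x)

for x in [(0.5, -0.5), (0.8, 0.3), (0.2, 0.3, -0.4)]:
    # Membership in the polar of the cube coincides with l1_norm(x) <= 1.
    print(x, in_polar_of_cube(x), l1_norm(x))
```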

We note that for a given arbitrary norm $\|\cdot\|$, the boundary of the unit ball $\partial \{ x \in \mathbb{R}^k : \|x\| \leq 1 \}$
is a $(k-1)$-dimensional topological manifold, which admits projections as
atlases. Indeed, let $x_0 \in \mathbb{R}^k$ so that $\|x_0\| = 1$. We claim that the projection onto $\{x_0\}^\perp$
of the set $\{ x \in \mathbb{R}^k : \|x\| = 1,\ \|x - \langle x, x_0 \rangle x_0\| < 1,\ \langle x, x_0 \rangle > 0 \}$ is a continuous bijection
with continuous inverse. First, continuity is clear. Next, pick $b \in \{x_0\}^\perp$ with $\|b\| < 1$,
and consider $b + t x_0$, $t \in \mathbb{R}$. Then $\|b + t x_0\| \geq \big| \|b\| - |t| \|x_0\| \big|$ so there must be points $t$ so
that $\|b + t x_0\| = 1$. Convexity guarantees that there are either two such points, or exactly
one continuum of them. The second possibility is easily discarded, since there must be
both positive and negative such numbers, and at $t = 0$ the inequality is strict. Also, only
one of those two points satisfies $\langle b + t x_0, x_0 \rangle > 0$, as $b \perp x_0$. Thus we have identified
our bijection. Clearly a proper continuous bijection is a homeomorphism, so our claim is
proved.
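The inverse of this chart can be computed numerically: given $b \perp x_0$ with $\|b\| < 1$, one looks for the unique $t > 0$ with $\|b + t x_0\| = 1$. The sketch below uses bisection and, as an assumption made only for the example, the $\ell^1$ norm on $\mathbb{R}^2$ with $x_0 = (1, 0)$; the function name `lift_to_sphere` is hypothetical.

```python
def l1_norm(v):
    return sum(abs(c) for c in v)

def lift_to_sphere(norm, b, x0, iterations=80):
    """Find the point b + t*x0 with t > 0 on the unit sphere of `norm` by bisection,
    assuming norm(x0) == 1, norm(b) < 1 and b orthogonal to x0."""
    def point(t):
        return [bi + t * xi for bi, xi in zip(b, x0)]
    lo, hi = 0.0, 2.0   # norm(b + 2*x0) >= 2 - norm(b) > 1, and norm(b) < 1 at t = 0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if norm(point(mid)) < 1.0:
            lo = mid
        else:
            hi = mid
    return point(hi)

x = lift_to_sphere(l1_norm, [0.0, 0.3], [1.0, 0.0])
print(x, l1_norm(x))
```

Convexity of $t \mapsto \|b + t x_0\|$ and the strict inequality at $t = 0$ are what make the sub-level set $\{ t \geq 0 : \|b + t x_0\| < 1 \}$ an interval, so the bisection converges to the single positive solution.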

Let us remind the reader of the notions of gradient and subgradient. First, for a convex
function $f$ we define the one-sided directional derivatives of $f$ at $x$ relative to $y$ by

$$f'_+(x; y) = \lim_{\lambda \downarrow 0} \frac{f(x + \lambda y) - f(x)}{\lambda}, \qquad f'_-(x; y) = \lim_{\lambda \uparrow 0} \frac{f(x + \lambda y) - f(x)}{\lambda}.$$

It is easy to observe that $-f'_+(x; -y) = f'_-(x; y)$, so that the directional derivative at
$x$ in the direction $y$ exists if and only if the one-sided directional derivatives exist and
satisfy the relation $f'_+(x; y) = -f'_+(x; -y)$. The inequality $f'_+(x; y) \geq f'_-(x; y)$ holds, and
generally $f'_+(x; \cdot)$ is a positively homogeneous convex function on $\mathbb{R}^k$ for any $x$. If $f'(x; \cdot)$
exists, then it is linear [36, Theorem 5.5.2].
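These one-sided derivatives are easy to approximate by finite differences. The sketch below does so for the $\ell^\infty$ norm on $\mathbb{R}^2$, chosen by us as a simple norm with kinks; the step size `h` and the function names are illustrative assumptions.

```python
def sup_norm(v):
    return max(abs(c) for c in v)

def one_sided(f, x, y, h=1e-6):
    """Finite-difference approximations (left, right) of the one-sided
    directional derivatives f'_-(x; y) and f'_+(x; y)."""
    fx = f(x)
    right = (f([xi + h * yi for xi, yi in zip(x, y)]) - fx) / h
    left = (f([xi - h * yi for xi, yi in zip(x, y)]) - fx) / (-h)
    return left, right

# At the vertex (1, 1) the norm has a kink in the direction (1, -1).
left, right = one_sided(sup_norm, [1.0, 1.0], [1.0, -1.0])
print(left, right)

# At (1, 0.5) the norm is differentiable, so both one-sided values agree.
print(one_sided(sup_norm, [1.0, 0.5], [1.0, -1.0]))
```

At the vertex the two values differ (approximately $-1$ and $1$), consistent with $f'_+ \geq f'_-$ and with $-f'_+(x; -y) = f'_-(x; y)$; at the smooth point they coincide.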

The gradient of $f$ at $x$ (if existing) is defined as $\nabla f(x) = (\partial_1 f(x), \ldots, \partial_k f(x))$, where
we use the short-hand notation $\partial_j f = \frac{\partial f}{\partial x_j}$. This means

$$\langle \nabla f(x), y \rangle = \sum_{j=1}^{k} y_j \partial_j f(x) = f'(x; y).$$

We observe that, generally, for a norm we have $\left| \frac{\|x + \lambda y\| - \|x\|}{\lambda} \right| \leq \|y\|$, so (by a slight abuse of
notation) we can write for our specific case $f'(x; y) = [f'_-(x; y), f'_+(x; y)] \subseteq [-f(y), f(y)] = [-\|y\|, \|y\|]$.

A subgradient of a convex function $f$ at a point $x$ is a vector $x^* \in \mathbb{R}^k$ so that

$$f(y) - f(x) \geq \langle x^*, y - x \rangle, \qquad \forall y \in \mathbb{R}^k.$$

(For our case, $\|y\| - \|x\| \geq \langle x^*, y - x \rangle$.) Geometrically, this means that the graph of $h(y) = f(x) + \langle x^*, y - x \rangle$
is a nonvertical supporting hyperplane of the epigraph of $f$ at the point $(x, f(x))$
[30, Section 23]. The set of all subgradients of $f$ at $x$ is called the subdifferential of $f$ at
$x$ and is denoted by $\partial f(x)$. If $f$ is differentiable, then $x^*$ is unique and $x^* = \nabla f(x)$, and,
conversely, if $\partial f(x)$ contains exactly one point, then $f$ is differentiable at $x$ [30, Theorem
25.1].
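The subgradient inequality can be tested directly on examples. As an illustration with a norm of our choosing (the $\ell^1$ norm, not the (t)-norm), the sign vector of $x$ is a subgradient at $x$, while a vector with a wrong sign is not; the helper names below are hypothetical.

```python
import random

def l1_norm(v):
    return sum(abs(c) for c in v)

def sign_subgradient(x):
    """One subgradient of the l1 norm at x: sign(x_i) where x_i != 0, and 0
    where x_i == 0 (any value in [-1, 1] would also work there)."""
    return [(c > 0) - (c < 0) for c in x]

def subgradient_gap(f, x, x_star, y):
    """f(y) - f(x) - <x*, y - x>; nonnegative for all y exactly when x* is a subgradient."""
    return f(y) - f(x) - sum(s * (yi - xi) for s, xi, yi in zip(x_star, x, y))

rng = random.Random(0)
x = [1.5, -2.0, 0.0]
x_star = sign_subgradient(x)
gaps = [subgradient_gap(l1_norm, x, x_star, [rng.uniform(-5, 5) for _ in x])
        for _ in range(1000)]
print(min(gaps))                                    # never negative
print(subgradient_gap(l1_norm, x, [1, 1, 0], [1.5, -1.0, 0.0]))  # negative: not a subgradient
```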

In addition, if the correspondence $x \mapsto \|x\|$ is differentiable around a point $a \neq 0$ then
the atlas described above is differentiable around $a$. Indeed, let us assume $x \mapsto \|x\|$ is
differentiable at $a$. It is clear that the derivative of this map in the direction $a$ at $a$
equals $\|a\|$, so $a$ is not a singular point. For $x \in A$ close enough to $a$ its image in $\{a\}^\perp$
is $b = x - \frac{\langle x, a \rangle}{\langle a, a \rangle} a$. So the correspondence from $b$ to $x$ is given by an implicit equation:
$x = b + ta$, $t > 0$. Then we write the implicit function equation for $\mathcal{F}(b, t) = \|b + ta\|$ as
$\mathcal{F}(b, t(b)) = 1$. As we know of the existence of the solution $t(b)$, we only need to verify
differentiability: $\partial_t \mathcal{F}(b, t) = \langle \nabla \|b + ta\|, a \rangle$ at $t = t(b)$ is well-defined by hypothesis and
nonzero by the condition that $b + t(b) a$ is close to $a$ (we know that $\langle \nabla \|a\|, a \rangle \geq \|a\| = 1$
from the subgradient inequality above evaluated at $x = a$ and $y = 0$).

The above considerations will allow us to perform a geometric analysis of the ball of
the (t)-norm and its dual.

Let us now analyze the correspondence between faces in terms of their dimensions. The 
general result which is of interest for us will be stated in the following remark: 

Remark 6.3. Assume that $g(x)$ is a norm so that $g^{-1}(1)$ is the real part of an analytic
set in the sense of [14]. Denote by $A = \{ x \in \mathbb{R}^k : g(x) \leq 1 \}$, and $A^*$ the unit ball in the
dual norm. We define $\varphi$ to be the polar face map from the faces of $A$ to the faces of $A^*$.
Then

(1) If $x \in \partial A$ is a point belonging to the relative interior of an exposed face $B$ of $A$ so
that $\partial A$ is a smooth manifold around $x$, then $\varphi(B)$ is a point in $\partial A^*$;

(2) If $x \in \partial A$ is a point belonging to the relative interior of an exposed face $B$ of
$A$ where there are $j \in \{1, \ldots, k-1\}$ independent directions in which $g$ is not
differentiable, then $\varphi(B)$ has dimension $j$.

In particular, an isolated "vertex" of such a ball, where the norm function is not
differentiable in any direction different from the vertex, corresponds to a piece of hyperplane
having nonempty $(k-1)$-dimensional interior, an "edge" - a segment included in the t-sphere
determining only one direction of differentiability - corresponds via $\varphi$ to a $(k-2)$-dimensional
piece and so on. The case important for us is when the unit ball is an analytic set (in the
sense of [14]), so its points of non-smoothness are well understood in terms of dimension.
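The dimension count in this remark can be seen in the simplest example, which we choose for illustration only: $g$ the $\ell^\infty$ norm on $\mathbb{R}^2$, whose dual ball is the $\ell^1$ ball. The sketch below lists, on a grid, the vectors $x^*$ with $\langle x^*, x_0 \rangle = g(x_0)$ and dual norm at most 1, which is the usual description of the face of the dual ball polar to the face containing $x_0$. The grid resolution and function name are assumptions of the example.

```python
def polar_face_on_grid(x0, steps=20, tol=1e-9):
    """Grid points x* of [-1, 1]^2 with <x*, x0> = ||x0||_inf and ||x*||_1 <= 1."""
    g_x0 = max(abs(c) for c in x0)
    grid = [i / steps for i in range(-steps, steps + 1)]
    return [(u, v) for u in grid for v in grid
            if abs(u * x0[0] + v * x0[1] - g_x0) <= tol
            and abs(u) + abs(v) <= 1.0 + tol]

vertex_face = polar_face_on_grid((1.0, 1.0))   # x0 is a vertex of the square
smooth_face = polar_face_on_grid((1.0, 0.5))   # x0 is in the interior of an edge
print(len(vertex_face), vertex_face[:3])
print(smooth_face)
```

At the vertex, where the norm fails to be differentiable in one independent direction, the polar face is a whole segment of the $\ell^1$ sphere (dimension 1); at the smooth point it reduces to the single vector $(1, 0)$, in line with items (2) and (1).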

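Proof. Fix a point $x_0$ with $g(x_0) = 1$ and let $B$ be the face in whose relative interior $x_0$
lives. Recall that $\varphi(B) = \{ x \in \mathbb{R}^k : \langle x, x_0 \rangle = 1,\ \langle x, a \rangle \leq 1\ \forall a \in A \} = \{ x \in \mathbb{R}^k : \langle x, x_0 \rangle = g(x_0),\ \langle x, a \rangle \leq g(a)\ \forall a \in \mathbb{R}^k \}$.
Subtracting the two defining relations $g(a) \geq \langle x, a \rangle$, $g(x_0) = \langle x, x_0 \rangle$
from each other gives $g(a) - g(x_0) \geq \langle x, a - x_0 \rangle$. This indicates that $x \in \varphi(B) \implies x \in \partial g(x_0)$, i.e.

$$\varphi(B) \subseteq \partial g(x_0).$$

In particular, if $g$ is differentiable at $x_0$, then $\varphi(B)$ contains exactly one point, as claimed
in (1).

We note however that evaluating $g(a) - g(x_0) \geq \langle x, a - x_0 \rangle$ at $a = t x_0$ gives $(t - 1) g(x_0) \geq (t - 1) \langle x, x_0 \rangle$.
In particular, when $t = 0$, we obtain $-g(x_0) \geq -\langle x, x_0 \rangle$, i.e. $g(x_0) \leq \langle x, x_0 \rangle$,
and when $t = 2$ we obtain $g(x_0) \geq \langle x, x_0 \rangle$. Thus, $g(x_0) = \langle x, x_0 \rangle$. Also, for $a = b + x_0$ we
have $g(b) \geq g(b + x_0) - g(x_0) \geq \langle x, b \rangle$ for all $b \in \mathbb{R}^k$. So $\partial g(x_0) \subseteq \varphi(B)$. Thus,

(32)    $\varphi(B) = \partial g(x_0)$  $\forall x_0$ in the relative interior of the exposed face $B$.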

Generally, from the definition of $\varphi(B)$ it follows that $x \in \varphi(B)$ if and only if $a \mapsto g(a) - \langle x, a \rangle$
reaches a global minimum at $a = x_0$ on all of $\mathbb{R}^k$. In particular, we look at $a = x_0 + \lambda y_0$.
Differentiation with respect to $\lambda$ to the left and right of zero gives $g'_\pm(x_0; y_0) - \langle x, y_0 \rangle$.
As $x_0$ is a point of minimum, it is clear that $\lambda \mapsto g(x_0 + \lambda y_0) - \langle x, x_0 + \lambda y_0 \rangle$ must decrease
as $\lambda$ grows to zero, and then increase after $\lambda$ has passed the point zero. So the derivative must
either be zero or change sign at $\lambda = 0$. So $g'_-(x_0; y_0) - \langle x, y_0 \rangle \leq 0$, $g'_+(x_0; y_0) - \langle x, y_0 \rangle \geq 0$,
i.e. $\langle x, y_0 \rangle \in [g'_-(x_0; y_0), g'_+(x_0; y_0)]$. As $g'_\pm(x_0; \cdot)$ is positively homogeneous, we may
assume $g(y_0) = 1$. Thus, we can write as a condition for $x \in \varphi(B)$

$$x \in \varphi(B) \implies \langle x, y_0 \rangle \in [g'_-(x_0; y_0), g'_+(x_0; y_0)] \text{ for all } y_0 \in \mathbb{R}^k,\ g(y_0) = 1,$$

which means that

(33)    $\partial g(x_0) \subseteq \{ x \in \mathbb{R}^k : g'_-(x_0; y_0) \leq \langle x, y_0 \rangle \leq g'_+(x_0; y_0)\ \forall y_0 \in \partial A \}.$

Let us note that if there are $l$ linearly independent directions $y_1, \ldots, y_l$ in $\{x_0\}^\perp$ so that $g$
is differentiable in all these directions at $x_0$, then for any vector $z \in \mathrm{Span}\{y_1, \ldots, y_l, x_0\} \subseteq \mathbb{R}^k$,
$g'(x_0; z)$ exists. Indeed, the function $\mathrm{Span}\{y_1, \ldots, y_l, x_0\} \ni z \mapsto g(x_0 + z)$ is still convex.
The partial derivatives of this function at zero, $\lim_{t \to 0} \frac{g(x_0 + t y_i) - g(x_0)}{t}$, $i \in \{1, 2, \ldots, l\}$,
and $\lim_{t \to 0} \frac{g(x_0 + t x_0) - g(x_0)}{t}$ all exist, so the function $z \mapsto g'_+(x_0; z)$ satisfies $-g'_+(x_0; z) = g'_+(x_0; -z)$
for $z \in \{x_0, y_1, \ldots, y_l\}$. Since $z \mapsto g'_+(x_0; z)$ is positively homogeneous and
convex [30, Theorem 23.1], it follows from [30, Theorem 4.8] that $z \mapsto g'_+(x_0; z)$ is in fact
linear on $\mathrm{Span}\{x_0, y_1, \ldots, y_l\}$. This, according to [30, Theorem 25.2], implies that $g'_+(x_0; \cdot)$
is differentiable on $\mathrm{Span}\{x_0, y_1, \ldots, y_l\}$. Thus, $g'(x_0; z) = \lim_{t \to 0} \frac{g(x_0 + tz) - g(x_0)}{t}$ exists for
any $z \in \mathrm{Span}\{y_1, \ldots, y_l, x_0\}$. This indicates that whenever $z \in \mathrm{Span}\{y_1, \ldots, y_l, x_0\}$ and
$x \in \varphi(B)$, $\langle x, z \rangle = g'(x_0; z)$. This gives us a system of $l + 1$ equations with $k$ unknowns,
so it fixes exactly $l + 1$ degrees of freedom of $x$. So $\partial g(x_0)$ is contained in an affine
variety of dimension at most $k - (l + 1)$.

To complete the proof we only need to show that for any of the other $k - (l + 1)$
directions, $x \in \varphi(B)$ is free to move for a nonzero distance, i.e. that $\varphi(B)$ is open in the
$(k - (l + 1))$-dimensional affine variety in which it lives. First of all, we must note that
for any $w \notin \mathrm{Span}\{y_1, \ldots, y_l, x_0\}$, $g'(x_0; w)$ does not exist. Indeed, by [30, Theorem 4.8],
any positively homogeneous convex function $f$ is linear on a subspace $L$ if and only if
$f(-x) = -f(x)$ for all $x \in L$, and this condition is true if merely $f(-b_i) = -f(b_i)$ for all
$b_1, \ldots, b_m$ forming a basis (not necessarily orthogonal!) of $L$. Applying this as above to
the right derivative $g'_+(x_0; \cdot)$ we conclude that if $g'(x_0; w)$ existed, then $g'_+(x_0; \cdot)$ would be
differentiable on the higher dimensional space $\mathrm{Span}\{x_0, y_1, \ldots, y_l, w\}$, a contradiction. We know [30, Section 23] that
$\partial g(x_0)$ is closed and convex, so assume that $x$ is in the relative interior of $\partial g(x_0)$. Choose
any direction $z \perp \mathrm{Span}\{x_0, y_1, \ldots, y_l\}$. We claim that for $|t|$ small enough, $x + tz \in \partial g(x_0)$.
Indeed, this is equivalent to the statement that $g(x_0 + b) - g(x_0) - \langle x + tz, b \rangle \geq 0$ for all
$b \in \mathbb{R}^k$. As $\Phi : b \mapsto g(x_0 + b) - g(x_0) - \langle x + tz, b \rangle$ takes the value zero at $b = 0$, we would
like to show that $b = 0$ is a point of global minimum. In particular, we shall take the real
function $\mathbb{R} \ni \lambda \mapsto \Phi(\lambda b)$ and we shall decompose $b = b_s + b_p$ with $b_s \in \mathrm{Span}\{x_0, y_1, \ldots, y_l\}$
and $b_p \perp \mathrm{Span}\{x_0, y_1, \ldots, y_l\}$, and, in particular, $\langle b_s, z \rangle = 0$. We have

$(A6) = g(x + Xb) - g(x ) - X(x + tz, b) = g(x Q + A6) - g(x ) - X(x, b) - tX(z, b p ). 

Differentiating in $\lambda$ gives $g'_{\pm}(x_0 + \lambda b) - \langle x, b \rangle - t \langle z, b_p \rangle$. (We have used $\pm$ to denote that we
consider, in the points where the derivative does not exist, the right and left derivatives; it
is known that, $\lambda \mapsto g(x_0 + \lambda b)$ being convex, these two exist and $g'_-(x_0 + \lambda b) \le g'_+(x_0 + \lambda b)$.)
Thus, as a function of $\lambda$, we can state that $g'_{\pm}(x_0 + \lambda b) - \langle x, b \rangle - t \langle z, b_p \rangle$ is strictly increasing,
with jump increases at the points of non-differentiability. In zero, by hypothesis $g'_-(x_0; b) \le g'_+(x_0; b)$
and $g'_-(x_0; b) \le \langle x, b \rangle \le g'_+(x_0; b)$ for all $b \in \mathbb{R}^k$ (see ([55])). As $x$ is in the relative
interior of $\varphi(B)$, we have $g'_-(x_0; b) < \langle x, b \rangle < g'_+(x_0; b)$ for all
$b \in \mathbb{R}^k \setminus \mathrm{Span}\{x_0, y_1, \dots, y_l\}$. We assume now
that $g(b) = 1$. Then clearly for $|t|$ small enough, $g'_-(x_0; b) < \langle x, b \rangle + t \langle z, b_p \rangle < g'_+(x_0; b)$
holds. Since both $g'_{\pm}(x_0; \cdot)$ are positively homogeneous, this is equivalent to
$g'_-(x_0; hb) < \langle x, hb \rangle + t \langle z, h b_p \rangle < g'_+(x_0; hb)$ for all $h > 0$. Thus,
$\lambda \mapsto g'_{\pm}(x_0 + \lambda b) - \langle x, b \rangle - t \langle z, b_p \rangle$
changes sign exactly at $\lambda = 0$. This proves our statement. □
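
The dimension count in this argument can be checked by hand on a toy case. The sketch below is an illustration only: it takes $g$ to be the $\ell^\infty$ norm on $\mathbb{R}^3$ and $x_0 = (1, 1, 0.2)$, both of which are assumptions made for the example and not the norm studied here. It tests the oddness criterion quoted from [30, Theorem 4.8] on a few directions, using forward differences for the one-sided derivative.

```python
# Illustration only: g is the l-infinity norm on R^3 (an assumption for this
# example, not the norm (t) of the paper) and x0 = (1, 1, 0.2).

def g(x):
    """l-infinity norm of a vector given as a list of floats."""
    return max(abs(v) for v in x)

def right_derivative(x0, z, h=1e-8):
    """Forward-difference approximation of the one-sided derivative g'_+(x0; z)."""
    shifted = [a + h * c for a, c in zip(x0, z)]
    return (g(shifted) - g(x0)) / h

def is_odd_along(x0, z, tol=1e-6):
    """Check g'_+(x0; -z) = -g'_+(x0; z), the linearity criterion on a basis vector."""
    minus_z = [-c for c in z]
    return abs(right_derivative(x0, minus_z) + right_derivative(x0, z)) < tol

x0 = [1.0, 1.0, 0.2]
# Two independent directions whose span contains x0; g is differentiable along them.
smooth_directions = [[1.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
# A direction outside that span, along which the two one-sided derivatives differ.
kinked_direction = [1.0, 0.0, 0.0]

print([is_odd_along(x0, z) for z in smooth_directions])
print(is_odd_along(x0, kinked_direction))
```

In this toy case $k = 3$ and the subspace of differentiability has dimension $l + 1 = 2$, so the count above predicts a subdifferential of dimension $k - (l + 1) = 1$. This agrees with the fact that the subdifferential of the $\ell^\infty$ norm at $x_0$ is the segment joining the first two basis vectors.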

We shall apply these simple observations in a corollary to the following theorem, which
describes the unit ball of the norm $\|\cdot\|_{(t)}$ (for a picture in the case $k = 2$, see Figure 1).

Theorem 6.4. The boundary of the unit ball in the norm $(t)$, denoted $S_t$, is locally
analytic. It can be expressed as the union of two intersecting cones, one with vertex at $1^k$,
and the other with vertex at $(-1)^k$. Its points of non-analyticity are as follows:

• When $1 - \frac{j}{k} < t < 1 - \frac{j-1}{k}$, then $S_t$ contains exposed faces of maximum dimension
$k - j$;

• In particular, when $t < \frac{1}{k}$, then $S_t$ contains no other segments except the ones
connecting each point of $S_t$ either with $1^k$ or with $(-1)^k$, while if $1 - \frac{1}{k} < t$, then $S_t$
is simply the boundary of the unit ball in the $\ell^\infty$ norm on $\mathbb{R}^k$.

If $\|x\|_{(t)} = -t\,\mathrm{minsupp}(\mu_x^{\boxplus 1/t})$, then $x$ belongs to the cone with vertex at $(-1)^k$, and if
$\|x\|_{(t)} = t\,\mathrm{maxsupp}(\mu_x^{\boxplus 1/t})$, then $x$ belongs to the cone with vertex at $1^k$. Moreover, if
$t < \frac{1}{k}$, then $\left\| \nabla \|b\|_{(t)} \right\|_1 = 1$ for all $b \in \mathbb{R}^k$, $b \notin \mathbb{R} \cdot 1^k$.

The above theorem tells us also that whenever $t < \frac{1}{k}$, the norm $(t)$ is "one segment
away" from being strictly convex.

Proof. With the notation $t = 1/s$, let us start by describing the set

\[ \{b \in \mathbb{R}^k : \mathrm{maxsupp}(\mu_b^{\boxplus 1/t}) \le 1\} = \{b \in \mathbb{R}^k : \mathrm{maxsupp}(\mu_b^{\boxplus s}) \le 1\}. \]

To start with, we shall argue that $\{b \in \mathbb{R}^k : \mathrm{maxsupp}(\mu_b^{\boxplus s}) = 1\}$ is an analytic set
whenever $t < \frac{1}{k}$ or, equivalently, $s > k$. (We understand this to mean that this set is part
of a larger complex analytic set in the sense of [H].) Observe that we can view $F_{\mu_b}(z)$ as
a function of $k + 1$ complex variables:



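\[ F(b_1, \dots, b_k, z) = F_{\mu_b}(z) = k \left[ \frac{1}{z - b_1} + \frac{1}{z - b_2} + \cdots + \frac{1}{z - b_k} \right]^{-1} \]

for all $z \ne b_j$ so that $\frac{1}{z - b_1} + \frac{1}{z - b_2} + \cdots + \frac{1}{z - b_k} \ne 0$. We record for future reference: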

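\[ \partial_{b_j} F_{\mu_b}(z) = -\frac{F_{\mu_b}(z)^2}{k} \cdot \frac{1}{(z - b_j)^2}, \qquad \partial_z F_{\mu_b}(z) = \frac{F_{\mu_b}(z)^2}{k} \left[ \frac{1}{(z - b_1)^2} + \cdots + \frac{1}{(z - b_k)^2} \right]. \tag{34} \]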

In particular,

\[ \partial_z F_{\mu_b}(z) = -\sum_{j=1}^{k} \partial_{b_j} F_{\mu_b}(z). \tag{35} \]
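
As a quick numerical sanity check, one may compare the derivatives of $F$ with finite differences. The sketch below uses the expressions $\partial_{b_j} F = -F^2 / (k (z - b_j)^2)$ and $\partial_z F = (F^2 / k) \sum_i (z - b_i)^{-2}$, computed directly from the definition of $F$. The sample point `b`, the value `z` and all function names are hypothetical choices made for this illustration.

```python
def F(b, z):
    """F(b_1, ..., b_k, z) = k / sum_i 1/(z - b_i), for real z away from the b_i."""
    return len(b) / sum(1.0 / (z - bi) for bi in b)

def dF_dz(b, z):
    """Partial derivative of F in z, computed from the definition of F."""
    return F(b, z) ** 2 / len(b) * sum(1.0 / (z - bi) ** 2 for bi in b)

def dF_dbj(b, z, j):
    """Partial derivative of F in the coordinate b_j (index j is 0-based here)."""
    return -F(b, z) ** 2 / (len(b) * (z - b[j]) ** 2)

def central_difference(fn, x, h=1e-6):
    """Symmetric finite-difference approximation of fn'(x)."""
    return (fn(x + h) - fn(x - h)) / (2.0 * h)

b = [0.3, -0.7, 1.2, 0.1]  # arbitrary sample point (hypothetical data)
z = 2.5                    # any real z > max(b) keeps every term finite

numeric_dz = central_difference(lambda w: F(b, w), z)
numeric_db = [
    central_difference(lambda v, j=j: F(b[:j] + [v] + b[j + 1:], z), b[j])
    for j in range(len(b))
]
analytic_db = [dF_dbj(b, z, j) for j in range(len(b))]

print(abs(numeric_dz - dF_dz(b, z)))                             # finite-difference error in z
print(max(abs(n - a) for n, a in zip(numeric_db, analytic_db)))  # error in the b_j
print(sum(analytic_db) + dF_dz(b, z))                            # the b_j-derivatives sum to -dF/dz
```

The last printed quantity vanishes up to rounding, which reflects the translation invariance $F(b_1 + c, \dots, b_k + c, z + c) = F(b_1, \dots, b_k, z)$.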



Equation (13) guarantees that under our hypothesis $(\mu_b)^{\boxplus 1/t}$ has no atoms, so by
Proposition 3.6 the supremum of the support of $(\mu_b)^{\boxplus 1/t}$ is given by the largest real
solution $w$ to the equation $(\partial_z F_{\mu_b})(w) = \frac{s}{s-1}$ via the formula $w + \left(\frac{1}{s} - 1\right) F_{\mu_b}(w)$. We
denote first by $w = f(b_1, \dots, b_k; s)$ the solution of $\partial_z F_{\mu_b}(w) = \frac{s}{s-1}$. Our first claim is
that the correspondence $(b_1, \dots, b_k; s) \mapsto f(b_1, \dots, b_k; s)$ is analytic in a neighborhood of
$(\mathbb{R}^k \setminus \{(b, \dots, b) \mid b \in \mathbb{R}\}) \times (k, +\infty)$ in $(\mathbb{C}^k \setminus \{(b, \dots, b) \mid b \in \mathbb{C}\}) \times \mathbb{C}$. This follows directly
from the implicit function theorem; to prove this, we shall rather write the partial
derivatives of $f$ (for future reference) instead of just verifying the required conditions for $F$.
So



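\[ \partial_{b_j} f(b_1, \dots, b_k; s) = -\frac{(\partial_{b_j} \partial_z F)(b_1, \dots, b_k, f(b_1, \dots, b_k; s))}{(\partial_z^2 F)(b_1, \dots, b_k, f(b_1, \dots, b_k; s))}, \tag{36} \]

\[ \partial_s f(b_1, \dots, b_k; s) = -\frac{1}{(\partial_z^2 F)(b_1, \dots, b_k, f(b_1, \dots, b_k; s)) \, (s - 1)^2}. \tag{37} \]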



We have seen from Proposition 3.6 that the function $w \mapsto F_{\mu_b}(w)$ is strictly concave
on the (unique) unbounded interval $J$ of analyticity containing arbitrarily large positive
numbers. Hence, for any solution $f(b_1, \dots, b_k; s) \in J$ in vectors $(b_1, \dots, b_k; s) \ne (b, \dots, b; s)$
(meaning away from the diagonal of $\mathbb{R}^k$), we have $(\partial_z^2 F)(b_1, \dots, b_k, f(b_1, \dots, b_k; s)) \ne 0$, so
we easily conclude from the analyticity of $\partial_z F$ that $f$ is complex analytic around these
points viewed as points in $(\mathbb{C}^k \setminus \{(b, \dots, b) \mid b \in \mathbb{C}\}) \times \mathbb{C}$. The easily observed fact that
$F(b, \dots, b, z) = z - b$ implies immediately that $f$ is not analytic in the variable $s$ in points
$(b, \dots, b; s)$. In addition, the above together with Proposition 3.6 implies that $f$ is not
analytic in any of the other variables either in the points $(b, \dots, b)$.
The above equalities together with equation (35) yield

\[ \sum_{j=1}^{k} \partial_{b_j} f(b_1, \dots, b_k; s) = 1. \tag{38} \]
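
The solution $w = f(b_1, \dots, b_k; s)$ is easy to approximate numerically. The following is a minimal sketch, assuming that $\partial_z F$ decreases on $(\max_j b_j, +\infty)$ (the strict concavity quoted above) and that the defining equation reads $\partial_z F(w) = s/(s-1)$. The bisection routine, the function names and the sample data are hypothetical choices, not part of the original argument.

```python
import math

def F(b, z):
    """F(b_1, ..., b_k, z) = k / sum_i 1/(z - b_i), for real z > max(b)."""
    return len(b) / sum(1.0 / (z - bi) for bi in b)

def dF_dz(b, z):
    """Partial derivative of F in z."""
    return F(b, z) ** 2 / len(b) * sum(1.0 / (z - bi) ** 2 for bi in b)

def edge_point(b, s):
    """Solve dF/dz(w) = s/(s-1) for w > max(b) by bisection (assumes s > 1)."""
    k, b_max = len(b), max(b)
    j = sum(1 for bi in b if bi == b_max)  # multiplicity of the largest entry
    target = s / (s - 1.0)
    if not target < k / j:                 # dF/dz tends to k/j at max(b)
        raise ValueError("no solution above max(b) for this s")
    lo_gap = 1.0
    while dF_dz(b, b_max + lo_gap) <= target:  # approach max(b) until dF/dz > target
        lo_gap /= 2.0
    hi_gap = 1.0
    while dF_dz(b, b_max + hi_gap) >= target:  # move outwards until dF/dz < target
        hi_gap *= 2.0
    lo, hi = b_max + lo_gap, b_max + hi_gap
    for _ in range(200):                   # dF/dz is decreasing on (max(b), +inf)
        mid = 0.5 * (lo + hi)
        if dF_dz(b, mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def scaled_top_edge(b, s):
    """The quantity w + (1/s - 1) F(w) evaluated at w = edge_point(b, s)."""
    w = edge_point(b, s)
    return w + (1.0 / s - 1.0) * F(b, w)

b, s = [1.0, -1.0], 4.0
w = edge_point(b, s)
value = scaled_top_edge(b, s)
# For b = (1, -1) one has F(z) = z - 1/z, so w = sqrt(s - 1) and value = 2 sqrt(s - 1) / s.
print(w, math.sqrt(s - 1.0))
print(value, 2.0 * math.sqrt(s - 1.0) / s)

c, shift = [0.9, 0.2, -0.4, -0.8], 0.37
shifted_w = edge_point([x + shift for x in c], 6.0)
print(shifted_w - shift - edge_point(c, 6.0))  # translation covariance of f
```

The last line checks that shifting every $b_j$ by a constant shifts $f$ by the same constant, which is the integrated form of the statement that the partial derivatives of $f$ in $b_1, \dots, b_k$ sum to one.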



The expression for $\|b\|_{(1/s)}$ (or, more precisely, for $t\,\mathrm{maxsupp}(\mu_b^{\boxplus s})$) is now written as

\[ f(b_1, \dots, b_k; s) + \left(\frac{1}{s} - 1\right) F(b_1, \dots, b_k, f(b_1, \dots, b_k; s)). \]






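Differentiating this function in each coordinate $b_j$ gives

\[ \partial_{b_j} \left( t\,\mathrm{maxsupp}(\mu_b^{\boxplus s}) \right) = \partial_{b_j} f(b; s) + \left(\frac{1}{s} - 1\right) \left[ (\partial_{b_j} F)(b, f(b; s)) + (\partial_z F)(b, f(b; s)) \, \partial_{b_j} f(b; s) \right] = \left(\frac{1}{s} - 1\right) (\partial_{b_j} F)(b, f(b; s)). \]

(We have used here that $(\partial_z F)(b, f(b; s)) = \frac{s}{s-1}$.) This guarantees analyticity of the
complex correspondence $b \mapsto \|b\|_{(1/s)}$ on a complex neighbourhood of the whole set of $b \in \mathbb{R}^k$
on which the norm $\|\cdot\|_{(1/s)}$ is achieved on the upper bound of the support of $\mu_b^{\boxplus s}$, for $s > k$
fixed. It is also remarkable that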

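\[ \left\| \nabla \|b\|_{(1/s)} \right\|_1 = \left(\frac{1}{s} - 1\right) \sum_{j=1}^{k} (\partial_{b_j} F)(b, f(b; s)) = -\left(\frac{1}{s} - 1\right) (\partial_z F)(b, f(b; s)) = 1, \tag{39} \]

as $(\partial_{b_j} F)(b, f(b; s))$ is easily seen to be negative from ([51]).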
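
The identity for the $\ell^1$ norm of the gradient can be observed numerically. The sketch below differentiates $b \mapsto w + (1/s - 1) F(b, w)$, with $w$ the solution of $\partial_z F(b, w) = s/(s-1)$, by central differences. It assumes $s > k$, so that a solution above $\max_j b_j$ exists for every $b$ off the diagonal. The bisection solver, the function names and the sample vector are hypothetical.

```python
def F(b, z):
    """F(b_1, ..., b_k, z) = k / sum_i 1/(z - b_i), for real z > max(b)."""
    return len(b) / sum(1.0 / (z - bi) for bi in b)

def dF_dz(b, z):
    """Partial derivative of F in z."""
    return F(b, z) ** 2 / len(b) * sum(1.0 / (z - bi) ** 2 for bi in b)

def edge_point(b, s):
    """Bisection solve of dF/dz(w) = s/(s-1) on (max(b), +inf); assumes s > k."""
    b_max, target = max(b), s / (s - 1.0)
    lo_gap = 1.0
    while dF_dz(b, b_max + lo_gap) <= target:  # dF/dz exceeds target near max(b)
        lo_gap /= 2.0
    hi_gap = 1.0
    while dF_dz(b, b_max + hi_gap) >= target:  # dF/dz tends to 1 at infinity
        hi_gap *= 2.0
    lo, hi = b_max + lo_gap, b_max + hi_gap
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if dF_dz(b, mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def scaled_top_edge(b, s):
    """w + (1/s - 1) F(b, w) at the edge point w."""
    w = edge_point(b, s)
    return w + (1.0 / s - 1.0) * F(b, w)

def gradient(b, s, h=1e-5):
    """Central-difference gradient of scaled_top_edge in the coordinates of b."""
    grad = []
    for j in range(len(b)):
        up = b[:j] + [b[j] + h] + b[j + 1:]
        down = b[:j] + [b[j] - h] + b[j + 1:]
        grad.append((scaled_top_edge(up, s) - scaled_top_edge(down, s)) / (2.0 * h))
    return grad

b, s = [0.9, 0.2, -0.4, -0.8], 6.0  # k = 4 and s > k, i.e. t = 1/s < 1/k
grad = gradient(b, s)
l1_norm = sum(abs(component) for component in grad)
w = edge_point(b, s)
# Closed form of each partial derivative: (1/s - 1) * dF/db_j at z = w.
closed_form = [(1.0 - 1.0 / s) * F(b, w) ** 2 / (len(b) * (w - bj) ** 2) for bj in b]
print(grad)
print(closed_form)
print(l1_norm)
```

All components of the gradient come out positive in this example, so the $\ell^1$ norm is simply their sum.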

We have proved now that the set $\{b \in \mathbb{R}^k : \mathrm{maxsupp}(\mu_b^{\boxplus s}) = 1\}$ is the real part of an
analytic set of complex dimension $k - 1$ in $\mathbb{C}^k$. We claim that this set cannot contain
a line that does not contain $1^k$. Indeed, assume towards contradiction that there exist
$b, c \in \mathbb{R}^k$ with $\mathrm{maxsupp}(\mu_b^{\boxplus s}) = \mathrm{maxsupp}(\mu_c^{\boxplus s}) = 1$ so that
$ub + (1-u)c \in \{b \in \mathbb{R}^k : \mathrm{maxsupp}(\mu_b^{\boxplus s}) = 1\}$ for all $u \in [0, 1]$. Then, of course,
$ub + (1-u)c \in \{b \in \mathbb{C}^k : \mathrm{maxsupp}(\mu_b^{\boxplus s}) = 1\}$ for all $u \in \mathbb{R}$ for which
$\mathrm{maxsupp}(\mu_{ub+(1-u)c}^{\boxplus s})$ is well defined, i.e. for all $u \in \mathbb{R}$. However, the set
$\{ub + (1-u)c : u \in \mathbb{R}\}$ must remain included in $\mathbb{R}^k$. This
tells us that the upper bound of the support of $\mu_{ub+(1-u)c}^{\boxplus s}$ must remain equal to one for
all $u \in \mathbb{R}$. This is not possible: since $b \ne c$ (and, moreover, the two do not differ by a
multiple of $1^k$), as $u$ tends to $\pm\infty$ clearly the diameter of the support of $\mu_{ub+(1-u)c}$ will
tend to infinity. If the expectation of $\mu_{ub+(1-u)c}$ is nonconstant (as a function of $u$), then
letting $u$ tend to infinity in the appropriate direction, we may make this expectation tend
to plus infinity. Clearly, as the expectation of $\mu_{ub+(1-u)c}^{\boxplus s}$ is simply $s$ times the expectation
of $\mu_{ub+(1-u)c}$, we obtain a contradiction with the upper boundedness of the support of
$\mu_{ub+(1-u)c}^{\boxplus s}$. If the expectation of $\mu_{ub+(1-u)c}$ is a constant function of $u$, then $\sum b_j = \sum c_j$.
Since $b \ne c$, there must be at least two distinct coordinates with differences of opposite
signs, so when $|u| \to \infty$, both ends of the support of $\mu_{ub+(1-u)c}$ must tend to infinity. Thus,
the variance of $\mu_{ub+(1-u)c}$ will necessarily tend to infinity. Since the variance depends
linearly on $s$, it follows that the variance of $\mu_{ub+(1-u)c}^{\boxplus s}$ also tends to infinity. But this is
impossible if the upper bound of its support is constantly equal to one and at the same
time its first moment stays constant.

This provides the proof of the more difficult part of our theorem. We note next
that at times $t = j/k$, $j \in \{1, 2, \dots, k\}$, we witness certain "phase transitions." Indeed,
whenever $t \in (1 - j/k, 1 - (j-1)/k)$ for some positive integer $j < k$, Proposition 3.6
part (2) and equation (13) guarantee that points of the form $(b_1, \dots, b_{k-j}, w, \dots, w)$ with
$-w < b_1, \dots, b_{k-j} < w$ will have norm $(t)$ constantly equal to $w$. However, smaller atoms
will disappear, i.e. if more than $k - j$ elements are of absolute value strictly less than $w$,
the norm of this vector will be strictly smaller than $w$. Thus, taking $w = 1$, these points will
generate a set (in fact an exposed face) of dimension at most $k - j$ in the boundary of the
ball of radius one in $\mathbb{R}^k$. This, in particular, guarantees that for $t > 1 - \frac{1}{k}$, $\|\cdot\|_{(t)} = \|\cdot\|_\infty$.

Finally, the geometry of this ball as the intersection of two cones is an immediate
consequence of Proposition 3.3. □
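
The phase transitions described in the proof can be explored numerically. The sketch below rests on three assumptions that should be checked against the precise statements of the paper: (i) the norm is the larger of $t\,\mathrm{maxsupp}$ and $-t\,\mathrm{minsupp}$ of $\mu_b^{\boxplus 1/t}$, as the last part of Theorem 6.4 suggests; (ii) when the largest entry of $b$ has multiplicity $j$ and $t \ge 1 - j/k$, the top of the support contributes the value $\max_i b_i$ itself; (iii) otherwise the edge formula $w + (t - 1) F(w)$ with $\partial_z F(w) = 1/(1-t)$ applies. All function names and sample vectors are hypothetical.

```python
def F(b, z):
    """F(b_1, ..., b_k, z) = k / sum_i 1/(z - b_i), for real z > max(b)."""
    return len(b) / sum(1.0 / (z - bi) for bi in b)

def dF_dz(b, z):
    """Partial derivative of F in z."""
    return F(b, z) ** 2 / len(b) * sum(1.0 / (z - bi) ** 2 for bi in b)

def scaled_top_edge(b, t):
    """Assumed value of t * maxsupp(mu_b^{boxplus 1/t}) for 0 < t <= 1."""
    k, b_max = len(b), max(b)
    j = sum(1 for bi in b if bi == b_max)  # multiplicity of the largest entry
    if t >= 1.0 - j / k:
        return b_max                       # assumption (ii): the top atom survives
    target = 1.0 / (1.0 - t)               # equals s/(s-1) with s = 1/t
    lo_gap = 1.0
    while dF_dz(b, b_max + lo_gap) <= target:
        lo_gap /= 2.0
    hi_gap = 1.0
    while dF_dz(b, b_max + hi_gap) >= target:
        hi_gap *= 2.0
    lo, hi = b_max + lo_gap, b_max + hi_gap
    for _ in range(200):                   # bisection; dF/dz decreases on (max(b), +inf)
        mid = 0.5 * (lo + hi)
        if dF_dz(b, mid) > target:
            lo = mid
        else:
            hi = mid
    w = 0.5 * (lo + hi)
    return w + (t - 1.0) * F(b, w)         # assumption (iii): edge formula

def norm_t(b, t):
    """Assumed formula (i) for the norm (t): larger of the two scaled support edges."""
    return max(scaled_top_edge(b, t), scaled_top_edge([-bi for bi in b], t))

# k = 4: above t = 1 - 1/k = 0.75 the norm agrees with the l-infinity norm.
print(norm_t([1.0, 0.5, -0.2, -1.0], 0.8))
# t = 0.6 lies in (1 - j/k, 1 - (j-1)/k) for j = 2: two top entries equal to 1 give norm 1.
print(norm_t([0.3, -0.5, 1.0, 1.0], 0.6))
# With only one entry equal to 1 the atom is too small and the norm drops below 1.
print(norm_t([0.3, -0.5, 0.9, 1.0], 0.6))
# k = 2, t = 1/4: closed form 2 sqrt(s - 1) / s with s = 4.
print(norm_t([1.0, -1.0], 0.25), 3.0 ** 0.5 / 2.0)
```

Under these assumptions the second and third examples reproduce the behaviour described in the proof: the face through $(b_1, \dots, b_{k-j}, 1, \dots, 1)$ stays on the unit sphere, while lowering one of the top entries moves the point strictly inside the ball.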



The above theorem will allow us to draw some conclusions about the shape of the dual
unit ball. We shall denote by $C^+$ and $C^-$ the two closed cones with vertex at $1^k$ and
$(-1)^k$ respectively, so that $S_t = \{x \in \mathbb{R}^k : \|x\|_{(t)} = 1\} = C^+ \cup C^-$. Note that for $t < \frac{1}{k}$
the analytic set $C^+ \cap C^-$ of real dimension $k - 2$ has no singularities. This follows from
the fact that $C^+$ and $C^-$ are parts of analytic sets which are smooth everywhere except
for $1^k$ and $(-1)^k$. Let us emphasize at this point that the intersection of the two cones
does not need to be contained in a hyperplane, as can be seen by looking at the
large $t$ case, when $S_t$ is the boundary of the $\ell^\infty$ ball.

Let us make a list of the smoothness at the possible faces of $S_t$:

(1) When $t > 1 - \frac{1}{k}$, the set $\{x \in \mathbb{R}^k : \|x\|_{(t)} = 1\}$ is simply the boundary of the $\ell^\infty$ unit ball;

(2) When $t \in (1 - j/k, 1 - (j-1)/k)$, a point belonging to the relative interior of
an exposed face of dimension $k - l$ has $k - l$ directions of smoothness for each
$k - 1 \ge l \ge j$. There are zero dimensional exposed faces with no direction of
smoothness along $S_t$.

(3) When $t < \frac{1}{k}$, there are only exposed faces of dimension 0 and 1. Two of the faces
of dimension zero have exactly $k - 1$ violations of smoothness, and infinitely many
ones (situated on $C^+ \cap C^-$) have exactly one. The points in the relative interior
of the one-dimensional faces are smooth.

We would like to emphasize that only exposed faces of dimension 1 and $k - 1$ contain
points in which $S_t$ is smooth. In addition, in terms of probability measures $\mu_x$, we note
that all points of non-smoothness on $S_t \setminus (C^+ \cap C^-)$ come from surviving atoms of $\mu_x^{\boxplus 1/t}$.
In particular, if $t < 1 - \frac{1}{k}$ and $x_1 < x_2 < \cdots < x_k$, then $S_t$ must be smooth at $x$. Recall
that $A = \{x \in \mathbb{R}^k : \|x\|_{(t)} \le 1\}$, $A^*$ denotes its polar dual, and $K_{k,t} = A^* \cap \Delta_k$.

Corollary 6.5. The faces of the set $A^*$ are as follows:

(1) For any $t \in (0, 1]$, $k \in \mathbb{N}$, the set $A^*$ contains in its boundary two exposed faces of
dimension $k - 1$, namely $\varphi(\{1^k\})$ and $\varphi(\{(-1)^k\})$.

(2) When $t \in (1 - j/k, 1 - (j-1)/k)$, the set $A^*$ has in addition exposed faces of
dimensions $l - 1$ for any $l \in \{j, \dots, k - 1\}$.

(3) In particular, when $t > \frac{k-1}{k}$, $A^*$ coincides with the unit ball of the $\ell^1$ norm.

(4) When $t < \frac{1}{k}$, exposed faces of $A^*$ are (I) $\varphi(\{1^k\})$ and $\varphi(\{(-1)^k\})$ which are two
hyperplanes, (II) $\varphi(s)$, where $s$ is a segment uniting a vertex with a point from
$C^+ \cap C^-$; each $\varphi(s)$ is a point, so their union is $(k-2)$-dimensional and smooth in
those $k - 2$ directions, and (III) $\varphi(\{c\})$, for all $c \in C^+ \cap C^-$; since in points of
$C^+ \cap C^-$ the $\|\cdot\|_{(t)}$-unit ball is smooth in all but one direction, each $\varphi(\{c\})$ is a
segment, and their union is a smooth $(k-1)$-dimensional manifold. Moreover, for
any $t < \frac{k-1}{k}$ the set $A^*$ has infinitely many exposed faces of dimension zero (i.e.
points).

Clearly, the second part of the above corollary is not expressed in its full strength.
However, the number of particular cases that would need to be treated makes a more
detailed discussion too involved to be worth pursuing here. Its proof is a straightforward
consequence of the above theorem and the remarks preceding it.

Finally, it is worth noting that $\varphi(\{1^k\}) = \{x \in \mathbb{R}^k : \sum x_j = 1, \ \langle x, a \rangle \le 1 \text{ for all } a \in A\}$,
so that $K_{k,t} = \Delta_k \cap A^* \subset \varphi(\{1^k\})$. A point in $\Delta_k$ with strictly decreasing coordinates
which is on the boundary of $A^*$ relative to $\Delta_k$ will then be a smooth point for this
boundary. Indeed, assume $x$ is such a point. We know from the previous theorem and
corollary that $x$ cannot be a smooth point of $\partial A^*$. Since it must belong to the relative
interior of an exposed face and it does belong to the relative boundary of $\varphi(\{1^k\})$, it is
clear that there is at least one other face of $A^*$ to which $x$ belongs, so that there is at
least one more point $a \in A \setminus \{1^k\}$ (more precisely, $a \in \varphi^{-1}(B)$ for some face $B \ne \varphi(\{1^k\})$)
so that $\sum x_j a_j = 1$ and $\sum x_j a'_j \le 1$ for all other $a' \in A$. We claim that this point $a$ must
(a) be unique up to convex combinations with $1^k$ and (b) have decreasing coordinates.
Indeed, assuming we have an $a$ satisfying these conditions which does not have decreasing
coordinates, then we can rearrange it so that its coordinates do decrease. Its $(t)$ norm
will not change, but its scalar product with $x$ will strictly increase from 1, contradicting
the definition of $A$ and $A^*$. Also, $\langle x, sa + (1-s) 1^k \rangle = 1$ for all $s \in [0, 1]$, so the lack of
uniqueness is proved. Now, finally, we need to argue that this is the only possible lack of
uniqueness. In order to show that, it is enough to argue that $S_t$ is smooth around $a$, or,
equivalently, that $\|a\|_{(t)}$ is not reached at an atom. If this were to happen, then we would
have $1 = a_1 = \cdots = a_j > a_{j+1} \ge \cdots \ge a_k$ (we know that at least one of the inequalities is
strict because $a \ne 1^k$). Then $1 = \langle a, x \rangle = x_1 + \cdots + x_j + a_{j+1} x_{j+1} + \cdots + a_k x_k < \sum x_j = 1$,
an obvious contradiction. Thus, by Theorem 6.4, $S_t$ is smooth at $a$, so $\varphi([a, 1^k]) = \{x\}$
is an exposed face.

The above discussion has as an immediate consequence the following remark: 

Remark 6.6. Let $a \in C$ be a non-degenerate direction of the canonical Weyl chamber $\Delta_k^{\downarrow}$
and $t < 1 - \frac{1}{k}$. Then the set $H(a, t) \cap K_{k,t}$ is a singleton.

We note that this result cannot be improved, as $K_{k,t} = \Delta_k$ when $t > \frac{k-1}{k}$.